From: Alejandro Colomar <alx@kernel.org>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: Jonny Grant <jg@jguk.org>,
Matthew House <mattlloydhouse@gmail.com>,
linux-man <linux-man@vger.kernel.org>,
GNU C Library <libc-alpha@sourceware.org>
Subject: Re: strncpy clarify result may not be null terminated
Date: Sat, 11 Nov 2023 22:13:40 +0100
Message-ID: <ZU_ui2gbSBrTKXnX@debian>
In-Reply-To: <49daa0a7-291a-44f3-a2dd-cf5fb26c6df2@cs.ucla.edu>
Hi Paul,
On Fri, Nov 10, 2023 at 02:14:13PM -0800, Paul Eggert wrote:
> On 2023-11-10 11:52, Alejandro Colomar wrote:
>
> > Do you have any numbers?
>
> It depends on size of course. With programs like 'tar' (one of the few
> programs that actually needs something like strncpy) the destination buffer
> is usually fairly small (32 bytes or less) though some of them are 100
> bytes. I used 16 bytes in the following shell transcript:
>
> $ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy strlcpy; do echo;
> echo $i:; time ./a.out 16 100000000 abcdefghijk $i; done
>
> strnlen+strcpy:
>
> real 0m0.411s
> user 0m0.411s
> sys 0m0.000s
>
> strnlen+memcpy:
>
> real 0m0.392s
> user 0m0.388s
> sys 0m0.004s
>
> strncpy:
>
> real 0m0.300s
> user 0m0.300s
> sys 0m0.000s
>
> stpncpy:
>
> real 0m0.326s
> user 0m0.326s
> sys 0m0.000s
>
> strlcpy:
>
> real 0m0.623s
> user 0m0.623s
> sys 0m0.000s
>
>
> ... where a.out was generated by compiling the attached program with gcc -O2
> on Ubuntu 23.10 64-bit on a Xeon W-1350.
>
> I wouldn't take these numbers all that seriously, as microbenchmarks like
> these are not that informative these days. Still, for a typical case one
> should not assume strncpy must be slower merely because it has more work to
> do; quite the contrary.
Thanks for the benchmark! Yeah, I won't take it as the last word, but
it shows the growth order (and its cause) of the different alternatives.
I'd like to point out some curious things about it:
- strnlen+strcpy is slower than strnlen+memcpy.
The compiler has all the information necessary here, so I don't see
why it's not optimizing out the strcpy(3) into a simple memcpy(3).
AFAICS, it's a missed optimization. Even with -O3, it misses the
optimization.
- strncpy is slower than stpncpy on my computer.
stpncpy is in fact the fastest call on my computer.
Was strncpy(3) optimized in a recent version of glibc that you have?
I'm using Debian Sid on an underclocked i9-13900T. Or is it maybe
just luck? I'm curious.
$ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy memccpy strlcpy; do
echo; echo $i:;
time ./a.out 16 100000000 abcdefghijk $i;
done;
strnlen+strcpy:
real 0m0.188s
user 0m0.184s
sys 0m0.004s
strnlen+memcpy:
real 0m0.148s
user 0m0.148s
sys 0m0.000s
strncpy:
real 0m0.157s
user 0m0.157s
sys 0m0.000s
stpncpy:
real 0m0.135s
user 0m0.135s
sys 0m0.000s
memccpy:
real 0m0.208s
user 0m0.208s
sys 0m0.000s
strlcpy:
real 0m0.322s
user 0m0.322s
sys 0m0.000s
- strlcpy(3) is very heavy. Much more than I expected. See some tests
with larger strings. The main growth of strlcpy(3) comes from slen.
$ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy memccpy strlcpy; do
echo; echo $i:;
time ./a.out 64 100000000 aaaabbbbaaaaccccaaaabbbbaaaadddd $i;
done;
strnlen+strcpy:
real 0m0.242s
user 0m0.242s
sys 0m0.000s
strnlen+memcpy:
real 0m0.190s
user 0m0.186s
sys 0m0.004s
strncpy:
real 0m0.174s
user 0m0.173s
sys 0m0.000s
stpncpy:
real 0m0.170s
user 0m0.166s
sys 0m0.004s
memccpy:
real 0m0.253s
user 0m0.249s
sys 0m0.004s
strlcpy:
real 0m1.385s
user 0m1.385s
sys 0m0.000s
- strncpy(3) also gets heavy compared to strnlen+memcpy.
Considering how small the difference with memcpy is for small
strings, I wouldn't recommend it instead of memcpy, except for
micro-optimizations. The main growth of strncpy(3) comes from dsize.
$ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy memccpy strlcpy; do
echo; echo $i:;
time ./a.out 256 100000000 aaaabbbbaaaaccccaaaabbbbaaaadddd $i;
done;
strnlen+strcpy:
real 0m0.234s
user 0m0.233s
sys 0m0.001s
strnlen+memcpy:
real 0m0.192s
user 0m0.192s
sys 0m0.000s
strncpy:
real 0m0.268s
user 0m0.268s
sys 0m0.000s
stpncpy:
real 0m0.267s
user 0m0.267s
sys 0m0.000s
memccpy:
real 0m0.257s
user 0m0.256s
sys 0m0.001s
strlcpy:
real 0m1.574s
user 0m1.574s
sys 0m0.000s
$ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy memccpy strlcpy; do
echo; echo $i:;
time ./a.out 4096 100000000 aaaabbbbaaaaccccaaaabbbbaaaadddd $i;
done;
strnlen+strcpy:
real 0m0.227s
user 0m0.227s
sys 0m0.000s
strnlen+memcpy:
real 0m0.190s
user 0m0.190s
sys 0m0.000s
strncpy:
real 0m1.400s
user 0m1.399s
sys 0m0.000s
stpncpy:
real 0m1.398s
user 0m1.398s
sys 0m0.000s
memccpy:
real 0m0.256s
user 0m0.256s
sys 0m0.000s
strlcpy:
real 0m1.184s
user 0m1.184s
sys 0m0.000s
- strnlen(3)+memcpy(3) becomes the fastest once dsize grows beyond a
few hundred bytes, and is only a few tens of percent slower than the
fastest for smaller buffers.
It is also the most semantically correct (together with
strnlen+strcpy), avoiding unnecessary dead code (padding). This
should get the main backing from the manual pages.
However, it can be useful to document typical alternatives to prevent
mistakes from users. Especially, since some micro-optimizations may
favor uses of strncpy(3).
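For reference, the strnlen(3)+memcpy(3) idiom measured above could be
wrapped in a helper roughly like the one below. This is only a sketch:
the name copy_str() and its return convention (-1 when the string
doesn't fit) are made up for illustration and are not an API proposed
in this thread.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen(3) */
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: copy the string src into dst[dsize] only if it fits.
   Returns the length of the string, or -1 if it would not fit, in which
   case dst is left untouched. */
static ssize_t
copy_str(char *dst, const char *src, size_t dsize)
{
    size_t  slen;

    slen = strnlen(src, dsize);  /* reads at most dsize bytes of src */
    if (slen == dsize)
        return -1;               /* no room for the terminating null byte */

    memcpy(dst, src, slen + 1);  /* writes slen + 1 bytes; no padding */
    return (ssize_t) slen;
}
```

With a 16-byte buffer, copy_str(buf, "abcdefghijk", 16) would return 11,
while the 32-byte test string from the runs above would return -1
without writing to buf. The cost depends on the string length and not
on dsize, which matches the flat timings for strnlen+memcpy above.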
Cheers,
Alex
> #include <stdlib.h>
> #include <string.h>
>
>
> int
> main (int argc, char **argv)
> {
> if (argc != 5)
> return 2;
> long bufsize = atol (argv[1]);
> char *buf = malloc (bufsize);
> long n = atol (argv[2]);
> char const *a = argv[3];
> if (strcmp (argv[4], "strnlen+strcpy") == 0)
> {
> for (long i = 0; i < n; i++)
> {
> if (strnlen (a, bufsize) == bufsize)
> return 1;
> strcpy (buf, a);
> }
> }
> else if (strcmp (argv[4], "strnlen+memcpy") == 0)
> {
> for (long i = 0; i < n; i++)
> {
> size_t alen = strnlen (a, bufsize);
> if (alen == bufsize)
> return 1;
> memcpy (buf, a, alen + 1);
> }
> }
> else if (strcmp (argv[4], "strncpy") == 0)
> {
> for (long i = 0; i < n; i++)
> if (strncpy (buf, a, bufsize)[bufsize - 1])
> return 1;
> }
> else if (strcmp (argv[4], "stpncpy") == 0)
> {
> for (long i = 0; i < n; i++)
> if (stpncpy (buf, a, bufsize) == buf + bufsize)
> return 1;
> }
I've added the following one for completeness. Especially now that
it'll be in C2x.
else if (strcmp (argv[4], "memccpy") == 0)
{
for (long i = 0; i < n; i++)
if (memccpy (buf, a, 0, bufsize) == NULL)
return 1;
}
> else if (strcmp (argv[4], "strlcpy") == 0)
> {
> for (long i = 0; i < n; i++)
> if (strlcpy (buf, a, bufsize) == bufsize)
This should have been >= bufsize, right?
> return 1;
> }
> else
> return 2;
> }
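On the >= question: strlcpy(3) returns strlen(src), not the number of
bytes copied, so a source longer than the buffer returns a value larger
than bufsize, and an == test misses it. A small sketch of that
convention follows. sketch_strlcpy() and truncated() are hypothetical
names; the stand-in is written out only so the example does not depend
on the libc providing strlcpy(3).

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in following strlcpy(3)'s return convention (hypothetical name).
   Copies at most dsize - 1 bytes, terminates dst if dsize > 0,
   and returns strlen(src). */
static size_t
sketch_strlcpy(char *dst, const char *src, size_t dsize)
{
    size_t  slen = strlen(src);  /* walks all of src, however long */

    if (dsize > 0) {
        size_t  n = (slen < dsize) ? slen : dsize - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return slen;
}

/* Truncation happened iff the source did not fit, i.e. slen >= dsize. */
static bool
truncated(char *dst, const char *src, size_t dsize)
{
    return sketch_strlcpy(dst, src, dsize) >= dsize;
}
```

With a 4-byte buffer, "abcd" (return value 4) is caught by both == and
>=, but "abcdefgh" (return value 8) is caught only by >=. The
unconditional strlen(src) is also why the cost grows with slen.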
--
<https://www.alejandro-colomar.es/>