Date: Sat, 11 Nov 2023 22:13:40 +0100
From: Alejandro Colomar
To: Paul Eggert
Cc: Jonny Grant, Matthew House, linux-man, GNU C Library
Subject: Re: strncpy clarify result may not be null terminated
References: <20231109031345.245703-1-mattlloydhouse@gmail.com> <250e0401-2eaa-461f-ae20-a7f44d0bc5ad@jguk.org> <49daa0a7-291a-44f3-a2dd-cf5fb26c6df2@cs.ucla.edu>
In-Reply-To: <49daa0a7-291a-44f3-a2dd-cf5fb26c6df2@cs.ucla.edu>
X-Mailing-List: linux-man@vger.kernel.org

Hi Paul,

On Fri, Nov 10, 2023 at 02:14:13PM -0800, Paul Eggert wrote:
> On 2023-11-10 11:52, Alejandro Colomar wrote:
>
> > Do you have any numbers?
>
> It depends on size of course. With programs like 'tar' (one of the few
> programs that actually needs something like strncpy) the destination buffer
> is usually fairly small (32 bytes or less) though some of them are 100
> bytes. I used 16 bytes in the following shell transcript:
>
> $ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy strlcpy; do echo;
> echo $i:; time ./a.out 16 100000000 abcdefghijk $i; done
>
> strnlen+strcpy:
>
> real    0m0.411s
> user    0m0.411s
> sys     0m0.000s
>
> strnlen+memcpy:
>
> real    0m0.392s
> user    0m0.388s
> sys     0m0.004s
>
> strncpy:
>
> real    0m0.300s
> user    0m0.300s
> sys     0m0.000s
>
> stpncpy:
>
> real    0m0.326s
> user    0m0.326s
> sys     0m0.000s
>
> strlcpy:
>
> real    0m0.623s
> user    0m0.623s
> sys     0m0.000s
>
>
> ... where a.out was generated by compiling the attached program with gcc -O2
> on Ubuntu 23.10 64-bit on a Xeon W-1350.
>
> I wouldn't take these numbers all that seriously, as microbenchmarks like
> these are not that informative these days. Still, for a typical case one
> should not assume strncpy must be slower merely because it has more work to
> do; quite the contrary.

Thanks for the benchmark!  Yeah, I won't take it as the last word, but
it shows the growth order (and its cause) of the different alternatives.

I'd like to point out some curious things about it:

-  strnlen+strcpy is slower than strnlen+memcpy.

   The compiler has all the information necessary here, so I don't see
   why it's not optimizing the strcpy(3) into a simple memcpy(3).
   AFAICS, it's a missed optimization.  Even with -O3, it misses the
   optimization.

-  strncpy is slower than stpncpy on my computer.  stpncpy is in fact
   the fastest call on my computer.

   Was strncpy(3) optimized in a recent version of glibc that you have?
   I'm using Debian Sid on an underclocked i9-13900T.  Or is it maybe
   just luck?  I'm curious.

    $ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy memccpy strlcpy; do echo; echo $i:; time ./a.out 16 100000000 abcdefghijk $i; done;

    strnlen+strcpy:

    real    0m0.188s
    user    0m0.184s
    sys     0m0.004s

    strnlen+memcpy:

    real    0m0.148s
    user    0m0.148s
    sys     0m0.000s

    strncpy:

    real    0m0.157s
    user    0m0.157s
    sys     0m0.000s

    stpncpy:

    real    0m0.135s
    user    0m0.135s
    sys     0m0.000s

    memccpy:

    real    0m0.208s
    user    0m0.208s
    sys     0m0.000s

    strlcpy:

    real    0m0.322s
    user    0m0.322s
    sys     0m0.000s

-  strlcpy(3) is very heavy.  Much more than I expected.  See some tests
   with larger strings.  The main growth of strlcpy(3) comes from slen
   (the length of the source string).

    $ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy memccpy strlcpy; do echo; echo $i:; time ./a.out 64 100000000 aaaabbbbaaaaccccaaaabbbbaaaadddd $i; done;

    strnlen+strcpy:

    real    0m0.242s
    user    0m0.242s
    sys     0m0.000s

    strnlen+memcpy:

    real    0m0.190s
    user    0m0.186s
    sys     0m0.004s

    strncpy:

    real    0m0.174s
    user    0m0.173s
    sys     0m0.000s

    stpncpy:

    real    0m0.170s
    user    0m0.166s
    sys     0m0.004s

    memccpy:

    real    0m0.253s
    user    0m0.249s
    sys     0m0.004s

    strlcpy:

    real    0m1.385s
    user    0m1.385s
    sys     0m0.000s

-  strncpy(3) also gets heavy compared to strnlen+memcpy.  Considering
   how small the difference with memcpy is for small strings, I wouldn't
   recommend it instead of memcpy, except for micro-optimizations.  The
   main growth of strncpy(3) comes from dsize (the size of the
   destination buffer).

    $ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy memccpy strlcpy; do echo; echo $i:; time ./a.out 256 100000000 aaaabbbbaaaaccccaaaabbbbaaaadddd $i; done;

    strnlen+strcpy:

    real    0m0.234s
    user    0m0.233s
    sys     0m0.001s

    strnlen+memcpy:

    real    0m0.192s
    user    0m0.192s
    sys     0m0.000s

    strncpy:

    real    0m0.268s
    user    0m0.268s
    sys     0m0.000s

    stpncpy:

    real    0m0.267s
    user    0m0.267s
    sys     0m0.000s

    memccpy:

    real    0m0.257s
    user    0m0.256s
    sys     0m0.001s

    strlcpy:

    real    0m1.574s
    user    0m1.574s
    sys     0m0.000s

    $ for i in strnlen+strcpy strnlen+memcpy strncpy stpncpy memccpy strlcpy; do echo; echo $i:; time ./a.out 4096 100000000 aaaabbbbaaaaccccaaaabbbbaaaadddd $i; done;

    strnlen+strcpy:

    real    0m0.227s
    user    0m0.227s
    sys     0m0.000s

    strnlen+memcpy:

    real    0m0.190s
    user    0m0.190s
    sys     0m0.000s

    strncpy:

    real    0m1.400s
    user    0m1.399s
    sys     0m0.000s

    stpncpy:

    real    0m1.398s
    user    0m1.398s
    sys     0m0.000s

    memccpy:

    real    0m0.256s
    user    0m0.256s
    sys     0m0.000s

    strlcpy:

    real    0m1.184s
    user    0m1.184s
    sys     0m0.000s

-  strnlen(3)+memcpy(3) becomes the fastest when dsize grows a bit over
   a few hundred bytes, and is only a few tens of percent slower than
   the fastest for smaller buffers.

   It is also the most semantically correct (together with
   strnlen+strcpy), avoiding unnecessary dead code (padding).

   This should get the main backing from the manual pages.  However, it
   can be useful to document typical alternatives to prevent mistakes
   from users, especially since some micro-optimizations may favor uses
   of strncpy(3).
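
For reference, the strnlen(3)+memcpy(3) pattern from the last point could be wrapped roughly as below.  This is only a sketch: the name copy_checked() and its return convention (pointer to the terminating null byte, or NULL on truncation) are made up for illustration and are not taken from any library or manual page.  It assumes a system that provides strnlen(3) (POSIX.1-2008).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Copy the string src into dst[dsize] if it fits, terminating null
 * byte included.  On success, return a pointer to the terminating
 * null byte in dst.  If src does not fit, write nothing and return
 * NULL.
 */
char *
copy_checked(char *dst, size_t dsize, const char *src)
{
	/* Reads at most dsize bytes of src, however long src is. */
	size_t slen = strnlen(src, dsize);

	if (slen == dsize)
		return NULL;            /* no room for the null byte */

	/* Copies only what is needed; no padding is written. */
	memcpy(dst, src, slen + 1);
	return dst + slen;
}
```

A caller would test the result against NULL, e.g. `if (copy_checked(buf, sizeof(buf), a) == NULL)`, which is the same truncation condition the benchmark program checks with `alen == bufsize`.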

Cheers,
Alex

> #include <stdlib.h>
> #include <string.h>
>
>
> int
> main (int argc, char **argv)
> {
>   if (argc != 5)
>     return 2;
>   long bufsize = atol (argv[1]);
>   char *buf = malloc (bufsize);
>   long n = atol (argv[2]);
>   char const *a = argv[3];
>   if (strcmp (argv[4], "strnlen+strcpy") == 0)
>     {
>       for (long i = 0; i < n; i++)
>         {
>           if (strnlen (a, bufsize) == bufsize)
>             return 1;
>           strcpy (buf, a);
>         }
>     }
>   else if (strcmp (argv[4], "strnlen+memcpy") == 0)
>     {
>       for (long i = 0; i < n; i++)
>         {
>           size_t alen = strnlen (a, bufsize);
>           if (alen == bufsize)
>             return 1;
>           memcpy (buf, a, alen + 1);
>         }
>     }
>   else if (strcmp (argv[4], "strncpy") == 0)
>     {
>       for (long i = 0; i < n; i++)
>         if (strncpy (buf, a, bufsize)[bufsize - 1])
>           return 1;
>     }
>   else if (strcmp (argv[4], "stpncpy") == 0)
>     {
>       for (long i = 0; i < n; i++)
>         if (stpncpy (buf, a, bufsize) == buf + bufsize)
>           return 1;
>     }

I've added the following one for completeness, especially now that it'll
be in C2x.

    else if (strcmp (argv[4], "memccpy") == 0)
      {
        for (long i = 0; i < n; i++)
          if (memccpy (buf, a, 0, bufsize) == NULL)
            return 1;
      }

>   else if (strcmp (argv[4], "strlcpy") == 0)
>     {
>       for (long i = 0; i < n; i++)
>         if (strlcpy (buf, a, bufsize) == bufsize)

This should have been >= bufsize, right?

>           return 1;
>     }
>   else
>     return 2;
> }
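
On the strlcpy(3) check questioned above, the sketch below shows why `>=` is the comparison that catches every truncation.  ref_strlcpy() is a stand-in written here from the documented semantics (copy at most dsize - 1 bytes, always terminate when dsize is nonzero, return the length of src).  It is not glibc's implementation, and copy_truncated() is a hypothetical wrapper.  It also makes visible why the cost grows with slen: the whole source is always measured.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in with the documented strlcpy(3) semantics, for illustration
 * only.  This is NOT the glibc implementation.
 */
size_t
ref_strlcpy(char *dst, const char *src, size_t dsize)
{
	size_t slen = strlen(src);      /* always walks all of src */

	if (dsize != 0) {
		size_t n = (slen < dsize) ? slen : dsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return slen;                    /* length of src, not of dst */
}

/* Hypothetical wrapper: true if src did not fit in dst[dsize]. */
bool
copy_truncated(char *dst, size_t dsize, const char *src)
{
	/*
	 * The return value can be larger than dsize, so comparing
	 * with == would miss any source longer than dsize bytes.
	 */
	return ref_strlcpy(dst, src, dsize) >= dsize;
}
```

With a 4-byte buffer and the source "abcdefgh", ref_strlcpy() returns 8, which an `== 4` test would not report as truncation.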