From: Alejandro Colomar <alx@kernel.org>
To: Matthew House <mattlloydhouse@gmail.com>
Cc: Jonny Grant <jg@jguk.org>, linux-man <linux-man@vger.kernel.org>
Subject: Re: strncpy clarify result may not be null terminated
Date: Wed, 8 Nov 2023 20:33:34 +0100 [thread overview]
Message-ID: <ZUvilH5kuQfTuZjy@debian> (raw)
In-Reply-To: <20231108021240.176996-1-mattlloydhouse@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 8926 bytes --]
Hi Matthew,
On Tue, Nov 07, 2023 at 09:12:37PM -0500, Matthew House wrote:
> On Tue, Nov 7, 2023 at 8:21 AM Alejandro Colomar <alx@kernel.org> wrote:
> > On Tue, Nov 07, 2023 at 11:52:44AM +0000, Jonny Grant wrote:
> > > We see things differently, I'm on the C standard side on this one. Would any information change your mind?
> >
> > It's difficult to say, but I doubt it. But let me ask you something:
> > In what cases would you find strncpy(3) appropriate to use, and why?
> > Maybe if I understand that it helps.
> >
> > Kind regards,
> > Alex
>
> Man pages aren't read only by people writing new code, but also by people
> reading and modifying existing code. And despite your preferences regarding
> which functions ought to be used to produce strings, it's a widespread (and
> correct) practice to produce a string from the character sequence created
> by strncpy(3). There are two ways of doing this, either by setting the last
> character of the destination buffer to null if you want to produce a
> truncated string, or by testing the last character against zero if you want
> to detect truncation and raise an error.
It is not strncpy(3) who truncated, but the programmer by adding a NULL
in buff[BUFSIZ - 1]. In the following snippet, strncpy(3) will not
truncate:
char cs[3];
strncpy(cs, "foo", 3);
And yet your code doing if (cs[2] != '\0') { goto error; } would think
it did. That's because you deformed strncpy(3) to implement a poor
man's strlcpy(3).
char cs[3];
strncpy(cs, "foo", 3);
cs[2] = '\0'; // The truncation is here, not in strncpy(3).
> I'm not aware of any alternative to a strncpy(3)-based snippet for
> producing a possibly-truncated copy of a string, except for your preferred
> strlcpy(3) or stpecpy(3), which aren't available to anyone without a
The Linux kernel has strscpy(3), which is also good, but is not
available to user space.
> brand-new glibc (nor, by extension, any applications or libraries that want
libbsd has provided strlcpy(3) since basically forever. It is a very
portable library. You don't need a brand-new glibc for having
strlcpy(3).
<https://libbsd.freedesktop.org/wiki/>
> to support people without a brand-new glibc, nor any libraries that want to
> support other platforms like Windows with only ISO C and POSIX-ish
If you program for Windows, it depends. If you have POSIX available,
you may be able to port libbsd; I don't know. In any case, I don't
case about Windows enough. You could always write your own string-
copying function for Windows.
> functions); snprintf(3), which has the insidious flaw of not supporting
> more than INT_MAX characters on pain of UB, and also produces a warning if
> the compiler notices the possible truncation; or strlen(3) + min() +
> memcpy(3) + manually adding a null terminator, which is certainly more
> explicit in its intent, and avoids strncpy(3)'s zero-filling behavior if
> that poses a performance problem, but similarly opens up room for
> off-by-one errors.
More than the performance problem, I'm more worried about the
maintainability of strncpy(3). When 20 years from now, a programmer
reading a piece of code full of strncpy(3) wants to migrate to a sane
function like strlcpy(3) or strcpy(3), the programmer needs to
understand if the zeroing was purposeful or just accidental. Because
by using strlcpy(3), it may start leaking some trailing data if the
trailing of the buffer is meaningful to some program.
>
> For the sake of reference, I looked into a few big C and C++ projects to
> see how often a strncpy(3)-based snippet was used to produce a truncated
> copy. I found 18 instances in glibc 2.38, 2 in util-linux 2.39.2 (in spite
> of its custom xstrncpy() function), 61 in GNU binutils 2.41, 43 in
> GDB 13.2, 1 in LLVM 17.0.4, 7 in CPython 3.12.0, 99 in OpenJDK 22+22,
> 10 in .NET Runtime 7.0.13, 3 in V8 12.1.82, and 86 in Firefox 120.0. (Note
> that I haven't filtered out vendored dependencies, so there's a little bit
> of double-counting.) It seems like most codebases that don't ban strncpy(3)
> use a derived snippet somewhere or another. Also, I found 3 instances in
> glibc 2.38 and 5 instances in Firefox 120.0 of detecting truncation by
> checking the last character.
I know. I've been rewriting the code handling strings in shadow-utils
for the last year, and ther was a lot of it. I fixed several small bugs
in the process, so I recommend avoiding it.
>
> So these two snippets really are widespread, especially among the long tail
> of smaller C and C++ applications and libraries that don't perform enough
> string manipulation that it warrants creating a custom set of more-
> foolproof wrapper functions (at least, in the opinion of their authors).
> Thus, since they're not going away, it would be useful for anyone reading
> the code to understand the concept behind how these two snippets work, that
> the only difference between the strncpy(3)'s special "character sequence"
> and an ordinary C string is an additional null terminator at the end of the
> destination buffer.
This is part of string_copying(7):
DESCRIPTION
Terms (and abbreviations)
string (str)
is a sequence of zero or more non‐null characters followed by a
null byte.
character sequence
is a sequence of zero or more non‐null characters. A program
should never use a character sequence where a string is required.
However, with appropriate care, a string can be used in the place
of a character sequence.
I think that is very explicit in the difference. strncpy(3) refers to
that page for understanding the differences, so I think it is
documented.
strncpy(3):
CAVEATS
The name of these functions is confusing. These functions produce a
null‐padded character sequence, not a string (see string_copying(7)).
>
> In other words, strncpy(3) doesn't create a truncated string, but it
> creates something which can be easily turned into to a truncated string,
> and that's its most relevant quality for most of its uses in existing code.
> Further, apart from snprintf(3), there's no other portable way to produce a
> truncated string without manual arithmetic. Thus, I'd also find it
Portable is relative. With libbsd, you can port to most POSIX systems.
Windows is another story.
> reasonable to highlight precisely why strncpy(3)'s output isn't a string
How about this?:
diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
index d4c2ce83d..c80c8b640 100644
--- a/man3/stpncpy.3
+++ b/man3/stpncpy.3
@@ -108,7 +108,10 @@ .SH HISTORY
.SH CAVEATS
The name of these functions is confusing.
These functions produce a null-padded character sequence,
-not a string (see
+not a string.
+While strings have a terminating NUL byte,
+character sequences do not have any terminating byte
+(see
.BR string_copying (7)).
.P
It's impossible to distinguish truncation by the result of the call,
> (viz., the lack of a null terminator), instead of trying to insist that its
> output is worlds apart from anything string-related, especially given the
> volume of existing correct code that belies that notion.
It is not correct code. That code is doing extra work which confuses
maintainers. It is a lot like writing dead code, since you're writing
zeros that nobody is reading, which confuses maintainers.
Also, I've seen a lot of off-by-one bugs in calls to strncpy(3), so no,
it's not correct code. It's rather dangerous code that just happens to
not be vulnerable most of the time.
>
> Or, to answer your question, "It's appropriate to keep using strncpy(3) in
> existing code where it's currently used as part of creating a truncated
> string, and it's not especially inappropriate to use strncpy(3) in new code
> as part of creating a truncated string, if the code must support platforms
> without strlcpy(3) or similar, and if the resulting snippets are few enough
> and well-commented enough that they create less mental load than creating
> and maintaining a custom helper function."
strncpy(3) calls are never well documented. Do you add a comment in
each such call saying "this zeroing is superfluous"? Probably not.
>
> (As an aside, I find the remark in the man page that "It's impossible to
> distinguish truncation by the result of the call" extremely misleading at
> best, since truncation can easily be distinguished by inspecting the last
> output character.)
Again, strncpy(3)'s truncation is impossible to detect. What you can
detect is that your construct that resembles strlcpy(3) truncates, which
is a different thing.
Thanks,
Alex
>
> Thank you,
> Matthew House
--
<https://www.alejandro-colomar.es/>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2023-11-08 19:33 UTC|newest]
Thread overview: 138+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-04 11:27 strncpy clarify result may not be null terminated Jonny Grant
2023-11-04 19:33 ` Alejandro Colomar
2023-11-04 21:18 ` Jonny Grant
2023-11-05 1:36 ` Alejandro Colomar
2023-11-05 21:16 ` Jonny Grant
2023-11-05 23:31 ` Alejandro Colomar
2023-11-07 11:52 ` Jonny Grant
2023-11-07 13:23 ` Alejandro Colomar
2023-11-07 14:19 ` Jonny Grant
2023-11-07 16:17 ` Alejandro Colomar
2023-11-07 17:00 ` Jonny Grant
2023-11-07 17:20 ` Alejandro Colomar
2023-11-08 6:18 ` Oskari Pirhonen
2023-11-08 9:51 ` Alejandro Colomar
2023-11-08 9:59 ` Thorsten Kukuk
2023-11-08 15:09 ` Alejandro Colomar
[not found] ` <6bcad2492ab843019aa63895beaea2ce@DB6PR04MB3255.eurprd04.prod.outlook.com>
2023-11-08 15:44 ` Thorsten Kukuk
2023-11-08 17:26 ` Adhemerval Zanella Netto
2023-11-08 14:06 ` Zack Weinberg
2023-11-08 15:07 ` Alejandro Colomar
2023-11-08 19:45 ` G. Branden Robinson
2023-11-08 21:35 ` Carlos O'Donell
2023-11-08 22:11 ` Alejandro Colomar
2023-11-08 23:31 ` Paul Eggert
2023-11-09 0:29 ` Alejandro Colomar
2023-11-09 10:13 ` Jonny Grant
2023-11-09 11:08 ` catenate vs concatenate (was: strncpy clarify result may not be null terminated) Alejandro Colomar
2023-11-09 14:06 ` catenate vs concatenate Jonny Grant
2023-11-27 14:33 ` catenate vs concatenate (was: strncpy clarify result may not be null terminated) Zack Weinberg
2023-11-27 15:08 ` Alejandro Colomar
2023-11-27 15:13 ` Alejandro Colomar
2023-11-27 16:59 ` G. Branden Robinson
2023-11-27 18:35 ` Zack Weinberg
2023-11-27 23:45 ` G. Branden Robinson
2023-11-09 11:13 ` strncpy clarify result may not be null terminated Alejandro Colomar
2023-11-09 14:05 ` Jonny Grant
2023-11-09 15:04 ` Alejandro Colomar
2023-11-08 19:04 ` DJ Delorie
2023-11-08 19:40 ` Alejandro Colomar
2023-11-08 19:58 ` DJ Delorie
2023-11-08 20:13 ` Alejandro Colomar
2023-11-08 21:07 ` DJ Delorie
2023-11-08 21:50 ` Alejandro Colomar
2023-11-08 22:17 ` [PATCH] stpncpy.3, string_copying.7: Clarify that st[rp]ncpy() do NOT produce a string Alejandro Colomar
2023-11-08 23:06 ` Paul Eggert
2023-11-08 23:28 ` DJ Delorie
2023-11-09 0:24 ` Alejandro Colomar
2023-11-09 14:11 ` Jonny Grant
2023-11-09 14:35 ` Alejandro Colomar
2023-11-09 14:47 ` Jonny Grant
2023-11-09 15:02 ` Alejandro Colomar
2023-11-09 17:30 ` DJ Delorie
2023-11-09 17:54 ` Andreas Schwab
2023-11-09 18:00 ` Alejandro Colomar
2023-11-09 19:42 ` Jonny Grant
2023-11-09 7:23 ` Oskari Pirhonen
2023-11-09 15:20 ` [PATCH v2 1/2] " Alejandro Colomar
2023-11-09 15:20 ` [PATCH v2 2/2] stpncpy.3, string.3, string_copying.7: Clarify that st[rp]ncpy() pad with null bytes Alejandro Colomar
2023-11-10 5:47 ` Oskari Pirhonen
2023-11-10 10:47 ` Alejandro Colomar
2023-11-08 2:12 ` strncpy clarify result may not be null terminated Matthew House
2023-11-08 19:33 ` Alejandro Colomar [this message]
2023-11-08 19:40 ` Alejandro Colomar
2023-11-09 3:13 ` Matthew House
2023-11-09 10:26 ` Jonny Grant
2023-11-09 10:31 ` Jonny Grant
2023-11-09 11:38 ` Alejandro Colomar
2023-11-09 12:43 ` Alejandro Colomar
2023-11-09 12:51 ` Xi Ruoyao
2023-11-09 14:01 ` Alejandro Colomar
2023-11-09 18:11 ` Paul Eggert
2023-11-09 23:48 ` Alejandro Colomar
2023-11-10 5:36 ` Paul Eggert
2023-11-10 11:05 ` Alejandro Colomar
2023-11-10 11:47 ` Alejandro Colomar
2023-11-10 17:58 ` Paul Eggert
2023-11-10 18:36 ` Alejandro Colomar
2023-11-10 20:19 ` Alejandro Colomar
2023-11-10 23:44 ` Jonny Grant
2023-11-10 19:52 ` Alejandro Colomar
2023-11-10 22:14 ` Paul Eggert
2023-11-11 21:13 ` Alejandro Colomar
2023-11-11 22:20 ` Paul Eggert
2023-11-12 9:52 ` Jonny Grant
2023-11-12 10:59 ` Alejandro Colomar
2023-11-12 20:49 ` Paul Eggert
2023-11-12 21:00 ` Alejandro Colomar
2023-11-12 21:45 ` Alejandro Colomar
2023-11-13 23:46 ` Jonny Grant
2023-11-17 21:57 ` Jonny Grant
2023-11-18 10:12 ` Alejandro Colomar
2023-11-18 23:03 ` Jonny Grant
2023-11-10 11:36 ` Jonny Grant
2023-11-10 13:15 ` Alejandro Colomar
2023-11-18 23:40 ` Jonny Grant
2023-11-20 11:56 ` Jonny Grant
2023-11-20 15:12 ` Alejandro Colomar
2023-11-20 23:08 ` Jonny Grant
2023-11-20 23:42 ` Alejandro Colomar
2023-11-10 11:23 ` Jonny Grant
2023-11-09 12:23 ` Alejandro Colomar
2023-11-09 12:35 ` Alejandro Colomar
2023-11-10 7:06 ` Oskari Pirhonen
2023-11-10 11:18 ` Alejandro Colomar
2023-11-11 7:55 ` Oskari Pirhonen
2023-11-10 16:06 ` Matthew House
2023-11-10 17:48 ` Alejandro Colomar
2023-11-13 15:01 ` Matthew House
2023-11-11 20:55 ` Jonny Grant
2023-11-11 21:15 ` Jonny Grant
2023-11-11 22:36 ` Alejandro Colomar
2023-11-11 23:19 ` Alejandro Colomar
2023-11-17 21:46 ` Jonny Grant
2023-11-18 9:37 ` PDF book of unreleased pages (was: strncpy clarify result may not be null terminated) Alejandro Colomar
2023-11-19 0:22 ` Deri
2023-11-19 1:19 ` Alejandro Colomar
2023-11-19 9:29 ` Alejandro Colomar
2023-11-19 16:21 ` Deri
2023-11-19 20:58 ` Alejandro Colomar
2023-11-20 0:46 ` G. Branden Robinson
2023-11-20 9:43 ` Alejandro Colomar
2023-11-18 9:44 ` NULL safety " Alejandro Colomar
2023-11-18 23:21 ` NULL safety Jonny Grant
2023-11-24 22:25 ` Alejandro Colomar
2023-11-25 0:57 ` Jonny Grant
2023-11-10 10:40 ` strncpy clarify result may not be null terminated Stefan Puiu
2023-11-10 11:06 ` Jonny Grant
2023-11-10 11:20 ` Alejandro Colomar
2023-11-12 9:17 ` [PATCH 0/2] Expand BUGS section of string_copying(7) Alejandro Colomar
2023-11-12 9:18 ` [PATCH 1/2] string_copying.7: BUGS: *cat(3) functions aren't always bad Alejandro Colomar
2023-11-12 9:18 ` [PATCH 2/2] string_copying.7: BUGS: Document strl{cpy,cat}(3)'s performance problems Alejandro Colomar
2023-11-12 11:26 ` [PATCH v2 0/3] Improve string_copying(7) Alejandro Colomar
2023-11-12 11:26 ` [PATCH v2 1/3] string_copying.7: BUGS: *cat(3) functions aren't always bad Alejandro Colomar
2023-11-17 21:43 ` Jonny Grant
2023-11-18 0:25 ` Signing all patches and email to this list Matthew House
2023-11-18 23:24 ` Jonny Grant
2023-11-12 11:26 ` [PATCH v2 2/3] string_copying.7: BUGS: Document strl{cpy,cat}(3)'s performance problems Alejandro Colomar
2023-11-12 11:27 ` [PATCH v2 3/3] strtcpy.3, string_copying.7: Add strtcpy(3) Alejandro Colomar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZUvilH5kuQfTuZjy@debian \
--to=alx@kernel.org \
--cc=jg@jguk.org \
--cc=linux-man@vger.kernel.org \
--cc=mattlloydhouse@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox