From: Alejandro Colomar <alx@kernel.org>
To: Matthew House <mattlloydhouse@gmail.com>
Cc: Jonny Grant <jg@jguk.org>, linux-man <linux-man@vger.kernel.org>
Subject: Re: strncpy clarify result may not be null terminated
Date: Wed, 8 Nov 2023 20:40:45 +0100 [thread overview]
Message-ID: <ZUvkPQZiep7k5536@debian> (raw)
In-Reply-To: <ZUvilH5kuQfTuZjy@debian>
[-- Attachment #1: Type: text/plain, Size: 9484 bytes --]
On Wed, Nov 08, 2023 at 08:33:34PM +0100, Alejandro Colomar wrote:
> Hi Matthew,
>
> On Tue, Nov 07, 2023 at 09:12:37PM -0500, Matthew House wrote:
> > On Tue, Nov 7, 2023 at 8:21 AM Alejandro Colomar <alx@kernel.org> wrote:
> > > On Tue, Nov 07, 2023 at 11:52:44AM +0000, Jonny Grant wrote:
> > > > We see things differently, I'm on the C standard side on this one. Would any information change your mind?
> > >
> > > It's difficult to say, but I doubt it. But let me ask you something:
> > > In what cases would you find strncpy(3) appropriate to use, and why?
> > > Maybe if I understand that it helps.
> > >
> > > Kind regards,
> > > Alex
> >
> > Man pages aren't read only by people writing new code, but also by people
> > reading and modifying existing code. And despite your preferences regarding
> > which functions ought to be used to produce strings, it's a widespread (and
> > correct) practice to produce a string from the character sequence created
> > by strncpy(3). There are two ways of doing this, either by setting the last
> > character of the destination buffer to null if you want to produce a
> > truncated string, or by testing the last character against zero if you want
> > to detect truncation and raise an error.
>
> It is not strncpy(3) who truncated, but the programmer by adding a NULL
Oops. s/NULL/NUL/
> in buff[BUFSIZ - 1]. In the following snippet, strncpy(3) will not
> truncate:
>
> char cs[3];
>
> strncpy(cs, "foo", 3);
>
> And yet your code doing if (cs[2] != '\0') { goto error; } would think
> it did. That's because you deformed strncpy(3) to implement a poor
> man's strlcpy(3).
>
>
> char cs[3];
>
> strncpy(cs, "foo", 3);
> cs[2] = '\0'; // The truncation is here, not in strncpy(3).
>
> > I'm not aware of any alternative to a strncpy(3)-based snippet for
> > producing a possibly-truncated copy of a string, except for your preferred
> > strlcpy(3) or stpecpy(3), which aren't available to anyone without a
>
> The Linux kernel has strscpy(3), which is also good, but is not
> available to user space.
>
> > brand-new glibc (nor, by extension, any applications or libraries that want
>
> libbsd has provided strlcpy(3) since basically forever. It is a very
> portable library. You don't need a brand-new glibc for having
> strlcpy(3).
>
> <https://libbsd.freedesktop.org/wiki/>
>
> > to support people without a brand-new glibc, nor any libraries that want to
> > support other platforms like Windows with only ISO C and POSIX-ish
>
> If you program for Windows, it depends. If you have POSIX available,
> you may be able to port libbsd; I don't know. In any case, I don't
> case about Windows enough. You could always write your own string-
> copying function for Windows.
>
> > functions); snprintf(3), which has the insidious flaw of not supporting
> > more than INT_MAX characters on pain of UB, and also produces a warning if
> > the compiler notices the possible truncation; or strlen(3) + min() +
> > memcpy(3) + manually adding a null terminator, which is certainly more
> > explicit in its intent, and avoids strncpy(3)'s zero-filling behavior if
> > that poses a performance problem, but similarly opens up room for
> > off-by-one errors.
>
> More than the performance problem, I'm more worried about the
> maintainability of strncpy(3). When 20 years from now, a programmer
> reading a piece of code full of strncpy(3) wants to migrate to a sane
> function like strlcpy(3) or strcpy(3), the programmer needs to
> understand if the zeroing was purposeful or just accidental. Because
> by using strlcpy(3), it may start leaking some trailing data if the
> trailing of the buffer is meaningful to some program.
>
> >
> > For the sake of reference, I looked into a few big C and C++ projects to
> > see how often a strncpy(3)-based snippet was used to produce a truncated
> > copy. I found 18 instances in glibc 2.38, 2 in util-linux 2.39.2 (in spite
> > of its custom xstrncpy() function), 61 in GNU binutils 2.41, 43 in
> > GDB 13.2, 1 in LLVM 17.0.4, 7 in CPython 3.12.0, 99 in OpenJDK 22+22,
> > 10 in .NET Runtime 7.0.13, 3 in V8 12.1.82, and 86 in Firefox 120.0. (Note
> > that I haven't filtered out vendored dependencies, so there's a little bit
> > of double-counting.) It seems like most codebases that don't ban strncpy(3)
> > use a derived snippet somewhere or another. Also, I found 3 instances in
> > glibc 2.38 and 5 instances in Firefox 120.0 of detecting truncation by
> > checking the last character.
>
> I know. I've been rewriting the code handling strings in shadow-utils
> for the last year, and ther was a lot of it. I fixed several small bugs
> in the process, so I recommend avoiding it.
>
> >
> > So these two snippets really are widespread, especially among the long tail
> > of smaller C and C++ applications and libraries that don't perform enough
> > string manipulation that it warrants creating a custom set of more-
> > foolproof wrapper functions (at least, in the opinion of their authors).
>
>
>
> > Thus, since they're not going away, it would be useful for anyone reading
> > the code to understand the concept behind how these two snippets work, that
> > the only difference between the strncpy(3)'s special "character sequence"
> > and an ordinary C string is an additional null terminator at the end of the
> > destination buffer.
>
> This is part of string_copying(7):
>
> DESCRIPTION
> Terms (and abbreviations)
> string (str)
> is a sequence of zero or more non‐null characters followed by a
> null byte.
>
> character sequence
> is a sequence of zero or more non‐null characters. A program
> should never use a character sequence where a string is required.
> However, with appropriate care, a string can be used in the place
> of a character sequence.
>
> I think that is very explicit in the difference. strncpy(3) refers to
> that page for understanding the differences, so I think it is
> documented.
>
> strncpy(3):
> CAVEATS
> The name of these functions is confusing. These functions produce a
> null‐padded character sequence, not a string (see string_copying(7)).
>
> >
> > In other words, strncpy(3) doesn't create a truncated string, but it
> > creates something which can be easily turned into to a truncated string,
> > and that's its most relevant quality for most of its uses in existing code.
> > Further, apart from snprintf(3), there's no other portable way to produce a
> > truncated string without manual arithmetic. Thus, I'd also find it
>
> Portable is relative. With libbsd, you can port to most POSIX systems.
> Windows is another story.
>
> > reasonable to highlight precisely why strncpy(3)'s output isn't a string
>
> How about this?:
>
> diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
> index d4c2ce83d..c80c8b640 100644
> --- a/man3/stpncpy.3
> +++ b/man3/stpncpy.3
> @@ -108,7 +108,10 @@ .SH HISTORY
> .SH CAVEATS
> The name of these functions is confusing.
> These functions produce a null-padded character sequence,
> -not a string (see
> +not a string.
> +While strings have a terminating NUL byte,
> +character sequences do not have any terminating byte
> +(see
> .BR string_copying (7)).
> .P
> It's impossible to distinguish truncation by the result of the call,
>
>
> > (viz., the lack of a null terminator), instead of trying to insist that its
> > output is worlds apart from anything string-related, especially given the
> > volume of existing correct code that belies that notion.
>
> It is not correct code. That code is doing extra work which confuses
> maintainers. It is a lot like writing dead code, since you're writing
> zeros that nobody is reading, which confuses maintainers.
>
> Also, I've seen a lot of off-by-one bugs in calls to strncpy(3), so no,
> it's not correct code. It's rather dangerous code that just happens to
> not be vulnerable most of the time.
>
> >
> > Or, to answer your question, "It's appropriate to keep using strncpy(3) in
> > existing code where it's currently used as part of creating a truncated
> > string, and it's not especially inappropriate to use strncpy(3) in new code
> > as part of creating a truncated string, if the code must support platforms
> > without strlcpy(3) or similar, and if the resulting snippets are few enough
> > and well-commented enough that they create less mental load than creating
> > and maintaining a custom helper function."
>
> strncpy(3) calls are never well documented. Do you add a comment in
> each such call saying "this zeroing is superfluous"? Probably not.
>
> >
> > (As an aside, I find the remark in the man page that "It's impossible to
> > distinguish truncation by the result of the call" extremely misleading at
> > best, since truncation can easily be distinguished by inspecting the last
> > output character.)
>
> Again, strncpy(3)'s truncation is impossible to detect. What you can
> detect is that your construct that resembles strlcpy(3) truncates, which
> is a different thing.
>
> Thanks,
> Alex
>
> >
> > Thank you,
> > Matthew House
>
> --
> <https://www.alejandro-colomar.es/>
--
<https://www.alejandro-colomar.es/>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2023-11-08 19:40 UTC|newest]
Thread overview: 138+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-04 11:27 strncpy clarify result may not be null terminated Jonny Grant
2023-11-04 19:33 ` Alejandro Colomar
2023-11-04 21:18 ` Jonny Grant
2023-11-05 1:36 ` Alejandro Colomar
2023-11-05 21:16 ` Jonny Grant
2023-11-05 23:31 ` Alejandro Colomar
2023-11-07 11:52 ` Jonny Grant
2023-11-07 13:23 ` Alejandro Colomar
2023-11-07 14:19 ` Jonny Grant
2023-11-07 16:17 ` Alejandro Colomar
2023-11-07 17:00 ` Jonny Grant
2023-11-07 17:20 ` Alejandro Colomar
2023-11-08 6:18 ` Oskari Pirhonen
2023-11-08 9:51 ` Alejandro Colomar
2023-11-08 9:59 ` Thorsten Kukuk
2023-11-08 15:09 ` Alejandro Colomar
[not found] ` <6bcad2492ab843019aa63895beaea2ce@DB6PR04MB3255.eurprd04.prod.outlook.com>
2023-11-08 15:44 ` Thorsten Kukuk
2023-11-08 17:26 ` Adhemerval Zanella Netto
2023-11-08 14:06 ` Zack Weinberg
2023-11-08 15:07 ` Alejandro Colomar
2023-11-08 19:45 ` G. Branden Robinson
2023-11-08 21:35 ` Carlos O'Donell
2023-11-08 22:11 ` Alejandro Colomar
2023-11-08 23:31 ` Paul Eggert
2023-11-09 0:29 ` Alejandro Colomar
2023-11-09 10:13 ` Jonny Grant
2023-11-09 11:08 ` catenate vs concatenate (was: strncpy clarify result may not be null terminated) Alejandro Colomar
2023-11-09 14:06 ` catenate vs concatenate Jonny Grant
2023-11-27 14:33 ` catenate vs concatenate (was: strncpy clarify result may not be null terminated) Zack Weinberg
2023-11-27 15:08 ` Alejandro Colomar
2023-11-27 15:13 ` Alejandro Colomar
2023-11-27 16:59 ` G. Branden Robinson
2023-11-27 18:35 ` Zack Weinberg
2023-11-27 23:45 ` G. Branden Robinson
2023-11-09 11:13 ` strncpy clarify result may not be null terminated Alejandro Colomar
2023-11-09 14:05 ` Jonny Grant
2023-11-09 15:04 ` Alejandro Colomar
2023-11-08 19:04 ` DJ Delorie
2023-11-08 19:40 ` Alejandro Colomar
2023-11-08 19:58 ` DJ Delorie
2023-11-08 20:13 ` Alejandro Colomar
2023-11-08 21:07 ` DJ Delorie
2023-11-08 21:50 ` Alejandro Colomar
2023-11-08 22:17 ` [PATCH] stpncpy.3, string_copying.7: Clarify that st[rp]ncpy() do NOT produce a string Alejandro Colomar
2023-11-08 23:06 ` Paul Eggert
2023-11-08 23:28 ` DJ Delorie
2023-11-09 0:24 ` Alejandro Colomar
2023-11-09 14:11 ` Jonny Grant
2023-11-09 14:35 ` Alejandro Colomar
2023-11-09 14:47 ` Jonny Grant
2023-11-09 15:02 ` Alejandro Colomar
2023-11-09 17:30 ` DJ Delorie
2023-11-09 17:54 ` Andreas Schwab
2023-11-09 18:00 ` Alejandro Colomar
2023-11-09 19:42 ` Jonny Grant
2023-11-09 7:23 ` Oskari Pirhonen
2023-11-09 15:20 ` [PATCH v2 1/2] " Alejandro Colomar
2023-11-09 15:20 ` [PATCH v2 2/2] stpncpy.3, string.3, string_copying.7: Clarify that st[rp]ncpy() pad with null bytes Alejandro Colomar
2023-11-10 5:47 ` Oskari Pirhonen
2023-11-10 10:47 ` Alejandro Colomar
2023-11-08 2:12 ` strncpy clarify result may not be null terminated Matthew House
2023-11-08 19:33 ` Alejandro Colomar
2023-11-08 19:40 ` Alejandro Colomar [this message]
2023-11-09 3:13 ` Matthew House
2023-11-09 10:26 ` Jonny Grant
2023-11-09 10:31 ` Jonny Grant
2023-11-09 11:38 ` Alejandro Colomar
2023-11-09 12:43 ` Alejandro Colomar
2023-11-09 12:51 ` Xi Ruoyao
2023-11-09 14:01 ` Alejandro Colomar
2023-11-09 18:11 ` Paul Eggert
2023-11-09 23:48 ` Alejandro Colomar
2023-11-10 5:36 ` Paul Eggert
2023-11-10 11:05 ` Alejandro Colomar
2023-11-10 11:47 ` Alejandro Colomar
2023-11-10 17:58 ` Paul Eggert
2023-11-10 18:36 ` Alejandro Colomar
2023-11-10 20:19 ` Alejandro Colomar
2023-11-10 23:44 ` Jonny Grant
2023-11-10 19:52 ` Alejandro Colomar
2023-11-10 22:14 ` Paul Eggert
2023-11-11 21:13 ` Alejandro Colomar
2023-11-11 22:20 ` Paul Eggert
2023-11-12 9:52 ` Jonny Grant
2023-11-12 10:59 ` Alejandro Colomar
2023-11-12 20:49 ` Paul Eggert
2023-11-12 21:00 ` Alejandro Colomar
2023-11-12 21:45 ` Alejandro Colomar
2023-11-13 23:46 ` Jonny Grant
2023-11-17 21:57 ` Jonny Grant
2023-11-18 10:12 ` Alejandro Colomar
2023-11-18 23:03 ` Jonny Grant
2023-11-10 11:36 ` Jonny Grant
2023-11-10 13:15 ` Alejandro Colomar
2023-11-18 23:40 ` Jonny Grant
2023-11-20 11:56 ` Jonny Grant
2023-11-20 15:12 ` Alejandro Colomar
2023-11-20 23:08 ` Jonny Grant
2023-11-20 23:42 ` Alejandro Colomar
2023-11-10 11:23 ` Jonny Grant
2023-11-09 12:23 ` Alejandro Colomar
2023-11-09 12:35 ` Alejandro Colomar
2023-11-10 7:06 ` Oskari Pirhonen
2023-11-10 11:18 ` Alejandro Colomar
2023-11-11 7:55 ` Oskari Pirhonen
2023-11-10 16:06 ` Matthew House
2023-11-10 17:48 ` Alejandro Colomar
2023-11-13 15:01 ` Matthew House
2023-11-11 20:55 ` Jonny Grant
2023-11-11 21:15 ` Jonny Grant
2023-11-11 22:36 ` Alejandro Colomar
2023-11-11 23:19 ` Alejandro Colomar
2023-11-17 21:46 ` Jonny Grant
2023-11-18 9:37 ` PDF book of unreleased pages (was: strncpy clarify result may not be null terminated) Alejandro Colomar
2023-11-19 0:22 ` Deri
2023-11-19 1:19 ` Alejandro Colomar
2023-11-19 9:29 ` Alejandro Colomar
2023-11-19 16:21 ` Deri
2023-11-19 20:58 ` Alejandro Colomar
2023-11-20 0:46 ` G. Branden Robinson
2023-11-20 9:43 ` Alejandro Colomar
2023-11-18 9:44 ` NULL safety " Alejandro Colomar
2023-11-18 23:21 ` NULL safety Jonny Grant
2023-11-24 22:25 ` Alejandro Colomar
2023-11-25 0:57 ` Jonny Grant
2023-11-10 10:40 ` strncpy clarify result may not be null terminated Stefan Puiu
2023-11-10 11:06 ` Jonny Grant
2023-11-10 11:20 ` Alejandro Colomar
2023-11-12 9:17 ` [PATCH 0/2] Expand BUGS section of string_copying(7) Alejandro Colomar
2023-11-12 9:18 ` [PATCH 1/2] string_copying.7: BUGS: *cat(3) functions aren't always bad Alejandro Colomar
2023-11-12 9:18 ` [PATCH 2/2] string_copying.7: BUGS: Document strl{cpy,cat}(3)'s performance problems Alejandro Colomar
2023-11-12 11:26 ` [PATCH v2 0/3] Improve string_copying(7) Alejandro Colomar
2023-11-12 11:26 ` [PATCH v2 1/3] string_copying.7: BUGS: *cat(3) functions aren't always bad Alejandro Colomar
2023-11-17 21:43 ` Jonny Grant
2023-11-18 0:25 ` Signing all patches and email to this list Matthew House
2023-11-18 23:24 ` Jonny Grant
2023-11-12 11:26 ` [PATCH v2 2/3] string_copying.7: BUGS: Document strl{cpy,cat}(3)'s performance problems Alejandro Colomar
2023-11-12 11:27 ` [PATCH v2 3/3] strtcpy.3, string_copying.7: Add strtcpy(3) Alejandro Colomar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZUvkPQZiep7k5536@debian \
--to=alx@kernel.org \
--cc=jg@jguk.org \
--cc=linux-man@vger.kernel.org \
--cc=mattlloydhouse@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox