public inbox for linux-man@vger.kernel.org
 help / color / mirror / Atom feed
From: Alejandro Colomar <alx@kernel.org>
To: Matthew House <mattlloydhouse@gmail.com>
Cc: Jonny Grant <jg@jguk.org>, linux-man <linux-man@vger.kernel.org>
Subject: Re: strncpy clarify result may not be null terminated
Date: Wed, 8 Nov 2023 20:40:45 +0100	[thread overview]
Message-ID: <ZUvkPQZiep7k5536@debian> (raw)
In-Reply-To: <ZUvilH5kuQfTuZjy@debian>

[-- Attachment #1: Type: text/plain, Size: 9484 bytes --]

On Wed, Nov 08, 2023 at 08:33:34PM +0100, Alejandro Colomar wrote:
> Hi Matthew,
> 
> On Tue, Nov 07, 2023 at 09:12:37PM -0500, Matthew House wrote:
> > On Tue, Nov 7, 2023 at 8:21 AM Alejandro Colomar <alx@kernel.org> wrote:
> > > On Tue, Nov 07, 2023 at 11:52:44AM +0000, Jonny Grant wrote:
> > > > We see things differently, I'm on the C standard side on this one. Would any information change your mind?
> > >
> > > It's difficult to say, but I doubt it.  But let me ask you something:
> > > In what cases would you find strncpy(3) appropriate to use, and why?
> > > Maybe if I understand that it helps.
> > >
> > > Kind regards,
> > > Alex
> > 
> > Man pages aren't read only by people writing new code, but also by people
> > reading and modifying existing code. And despite your preferences regarding
> > which functions ought to be used to produce strings, it's a widespread (and
> > correct) practice to produce a string from the character sequence created
> > by strncpy(3). There are two ways of doing this, either by setting the last
> > character of the destination buffer to null if you want to produce a
> > truncated string, or by testing the last character against zero if you want
> > to detect truncation and raise an error.
> 
> It is not strncpy(3) who truncated, but the programmer by adding a NULL

Oops.  s/NULL/NUL/

> in buff[BUFSIZ - 1].  In the following snippet, strncpy(3) will not
> truncate:
> 
> 	char cs[3];
> 
> 	strncpy(cs, "foo", 3);
> 
> And yet your code doing if (cs[2] != '\0') { goto error; } would think
> it did.  That's because you deformed strncpy(3) to implement a poor
> man's strlcpy(3).  
> 
> 
> 	char cs[3];
> 
> 	strncpy(cs, "foo", 3);
> 	cs[2] = '\0';  // The truncation is here, not in strncpy(3).
> 
> > I'm not aware of any alternative to a strncpy(3)-based snippet for
> > producing a possibly-truncated copy of a string, except for your preferred
> > strlcpy(3) or stpecpy(3), which aren't available to anyone without a
> 
> The Linux kernel has strscpy(3), which is also good, but is not
> available to user space.
> 
> > brand-new glibc (nor, by extension, any applications or libraries that want
> 
> libbsd has provided strlcpy(3) since basically forever.  It is a very
> portable library.  You don't need a brand-new glibc for having
> strlcpy(3).
> 
> <https://libbsd.freedesktop.org/wiki/>
> 
> > to support people without a brand-new glibc, nor any libraries that want to
> > support other platforms like Windows with only ISO C and POSIX-ish
> 
> If you program for Windows, it depends.  If you have POSIX available,
> you may be able to port libbsd; I don't know.  In any case, I don't
> case about Windows enough.  You could always write your own string-
> copying function for Windows.
> 
> > functions); snprintf(3), which has the insidious flaw of not supporting
> > more than INT_MAX characters on pain of UB, and also produces a warning if
> > the compiler notices the possible truncation; or strlen(3) + min() +
> > memcpy(3) + manually adding a null terminator, which is certainly more
> > explicit in its intent, and avoids strncpy(3)'s zero-filling behavior if
> > that poses a performance problem, but similarly opens up room for
> > off-by-one errors.
> 
> More than the performance problem, I'm more worried about the
> maintainability of strncpy(3).  When 20 years from now, a programmer
> reading a piece of code full of strncpy(3) wants to migrate to a sane
> function like strlcpy(3) or strcpy(3), the programmer needs to
> understand if the zeroing was purposeful or just accidental.  Because
> by using strlcpy(3), it may start leaking some trailing data if the
> trailing of the buffer is meaningful to some program.
> 
> > 
> > For the sake of reference, I looked into a few big C and C++ projects to
> > see how often a strncpy(3)-based snippet was used to produce a truncated
> > copy. I found 18 instances in glibc 2.38, 2 in util-linux 2.39.2 (in spite
> > of its custom xstrncpy() function), 61 in GNU binutils 2.41, 43 in
> > GDB 13.2, 1 in LLVM 17.0.4, 7 in CPython 3.12.0, 99 in OpenJDK 22+22,
> > 10 in .NET Runtime 7.0.13, 3 in V8 12.1.82, and 86 in Firefox 120.0. (Note
> > that I haven't filtered out vendored dependencies, so there's a little bit
> > of double-counting.) It seems like most codebases that don't ban strncpy(3)
> > use a derived snippet somewhere or another. Also, I found 3 instances in
> > glibc 2.38 and 5 instances in Firefox 120.0 of detecting truncation by
> > checking the last character.
> 
> I know.  I've been rewriting the code handling strings in shadow-utils
> for the last year, and ther was a lot of it.  I fixed several small bugs
> in the process, so I recommend avoiding it.
> 
> > 
> > So these two snippets really are widespread, especially among the long tail
> > of smaller C and C++ applications and libraries that don't perform enough
> > string manipulation that it warrants creating a custom set of more-
> > foolproof wrapper functions (at least, in the opinion of their authors).
> 
> 
> 
> > Thus, since they're not going away, it would be useful for anyone reading
> > the code to understand the concept behind how these two snippets work, that
> > the only difference between the strncpy(3)'s special "character sequence"
> > and an ordinary C string is an additional null terminator at the end of the
> > destination buffer.
> 
> This is part of string_copying(7):
> 
> DESCRIPTION
>    Terms (and abbreviations)
>      string (str)
>             is  a  sequence  of zero or more non‐null characters followed by a
>             null byte.
> 
>      character sequence
>             is a sequence of zero or  more  non‐null  characters.   A  program
>             should  never use a character sequence where a string is required.
>             However, with appropriate care, a string can be used in the  place
>             of a character sequence.
> 
> I think that is very explicit in the difference.  strncpy(3) refers to
> that page for understanding the differences, so I think it is
> documented.
> 
> strncpy(3):
> CAVEATS
>      The  name  of  these  functions  is confusing.  These functions produce a
>      null‐padded character sequence, not a string (see string_copying(7)).
> 
> > 
> > In other words, strncpy(3) doesn't create a truncated string, but it
> > creates something which can be easily turned into to a truncated string,
> > and that's its most relevant quality for most of its uses in existing code.
> > Further, apart from snprintf(3), there's no other portable way to produce a
> > truncated string without manual arithmetic. Thus, I'd also find it
> 
> Portable is relative.  With libbsd, you can port to most POSIX systems.
> Windows is another story.
> 
> > reasonable to highlight precisely why strncpy(3)'s output isn't a string
> 
> How about this?:
> 
> diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
> index d4c2ce83d..c80c8b640 100644
> --- a/man3/stpncpy.3
> +++ b/man3/stpncpy.3
> @@ -108,7 +108,10 @@ .SH HISTORY
>  .SH CAVEATS
>  The name of these functions is confusing.
>  These functions produce a null-padded character sequence,
> -not a string (see
> +not a string.
> +While strings have a terminating NUL byte,
> +character sequences do not have any terminating byte
> +(see
>  .BR string_copying (7)).
>  .P
>  It's impossible to distinguish truncation by the result of the call,
> 
> 
> > (viz., the lack of a null terminator), instead of trying to insist that its
> > output is worlds apart from anything string-related, especially given the
> > volume of existing correct code that belies that notion.
> 
> It is not correct code.  That code is doing extra work which confuses
> maintainers.  It is a lot like writing dead code, since you're writing
> zeros that nobody is reading, which confuses maintainers.
> 
> Also, I've seen a lot of off-by-one bugs in calls to strncpy(3), so no,
> it's not correct code.  It's rather dangerous code that just happens to
> not be vulnerable most of the time.
> 
> > 
> > Or, to answer your question, "It's appropriate to keep using strncpy(3) in
> > existing code where it's currently used as part of creating a truncated
> > string, and it's not especially inappropriate to use strncpy(3) in new code
> > as part of creating a truncated string, if the code must support platforms
> > without strlcpy(3) or similar, and if the resulting snippets are few enough
> > and well-commented enough that they create less mental load than creating
> > and maintaining a custom helper function."
> 
> strncpy(3) calls are never well documented.  Do you add a comment in
> each such call saying "this zeroing is superfluous"?  Probably not.
> 
> > 
> > (As an aside, I find the remark in the man page that "It's impossible to
> > distinguish truncation by the result of the call" extremely misleading at
> > best, since truncation can easily be distinguished by inspecting the last
> > output character.)
> 
> Again, strncpy(3)'s truncation is impossible to detect.  What you can
> detect is that your construct that resembles strlcpy(3) truncates, which
> is a different thing.
> 
> Thanks,
> Alex
> 
> > 
> > Thank you,
> > Matthew House
> 
> -- 
> <https://www.alejandro-colomar.es/>



-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2023-11-08 19:40 UTC|newest]

Thread overview: 138+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-04 11:27 strncpy clarify result may not be null terminated Jonny Grant
2023-11-04 19:33 ` Alejandro Colomar
2023-11-04 21:18   ` Jonny Grant
2023-11-05  1:36     ` Alejandro Colomar
2023-11-05 21:16   ` Jonny Grant
2023-11-05 23:31     ` Alejandro Colomar
2023-11-07 11:52       ` Jonny Grant
2023-11-07 13:23         ` Alejandro Colomar
2023-11-07 14:19           ` Jonny Grant
2023-11-07 16:17             ` Alejandro Colomar
2023-11-07 17:00               ` Jonny Grant
2023-11-07 17:20                 ` Alejandro Colomar
2023-11-08  6:18               ` Oskari Pirhonen
2023-11-08  9:51                 ` Alejandro Colomar
2023-11-08  9:59                   ` Thorsten Kukuk
2023-11-08 15:09                     ` Alejandro Colomar
     [not found]                     ` <6bcad2492ab843019aa63895beaea2ce@DB6PR04MB3255.eurprd04.prod.outlook.com>
2023-11-08 15:44                       ` Thorsten Kukuk
2023-11-08 17:26                         ` Adhemerval Zanella Netto
2023-11-08 14:06                   ` Zack Weinberg
2023-11-08 15:07                     ` Alejandro Colomar
2023-11-08 19:45                       ` G. Branden Robinson
2023-11-08 21:35                       ` Carlos O'Donell
2023-11-08 22:11                         ` Alejandro Colomar
2023-11-08 23:31                           ` Paul Eggert
2023-11-09  0:29                             ` Alejandro Colomar
2023-11-09 10:13                               ` Jonny Grant
2023-11-09 11:08                                 ` catenate vs concatenate (was: strncpy clarify result may not be null terminated) Alejandro Colomar
2023-11-09 14:06                                   ` catenate vs concatenate Jonny Grant
2023-11-27 14:33                                   ` catenate vs concatenate (was: strncpy clarify result may not be null terminated) Zack Weinberg
2023-11-27 15:08                                     ` Alejandro Colomar
2023-11-27 15:13                                       ` Alejandro Colomar
2023-11-27 16:59                                       ` G. Branden Robinson
2023-11-27 18:35                                         ` Zack Weinberg
2023-11-27 23:45                                           ` G. Branden Robinson
2023-11-09 11:13                                 ` strncpy clarify result may not be null terminated Alejandro Colomar
2023-11-09 14:05                                   ` Jonny Grant
2023-11-09 15:04                                     ` Alejandro Colomar
2023-11-08 19:04                   ` DJ Delorie
2023-11-08 19:40                     ` Alejandro Colomar
2023-11-08 19:58                       ` DJ Delorie
2023-11-08 20:13                         ` Alejandro Colomar
2023-11-08 21:07                           ` DJ Delorie
2023-11-08 21:50                             ` Alejandro Colomar
2023-11-08 22:17                               ` [PATCH] stpncpy.3, string_copying.7: Clarify that st[rp]ncpy() do NOT produce a string Alejandro Colomar
2023-11-08 23:06                                 ` Paul Eggert
2023-11-08 23:28                                   ` DJ Delorie
2023-11-09  0:24                                   ` Alejandro Colomar
2023-11-09 14:11                                   ` Jonny Grant
2023-11-09 14:35                                     ` Alejandro Colomar
2023-11-09 14:47                                       ` Jonny Grant
2023-11-09 15:02                                         ` Alejandro Colomar
2023-11-09 17:30                                           ` DJ Delorie
2023-11-09 17:54                                             ` Andreas Schwab
2023-11-09 18:00                                             ` Alejandro Colomar
2023-11-09 19:42                                             ` Jonny Grant
2023-11-09  7:23                                 ` Oskari Pirhonen
2023-11-09 15:20                                 ` [PATCH v2 1/2] " Alejandro Colomar
2023-11-09 15:20                                 ` [PATCH v2 2/2] stpncpy.3, string.3, string_copying.7: Clarify that st[rp]ncpy() pad with null bytes Alejandro Colomar
2023-11-10  5:47                                   ` Oskari Pirhonen
2023-11-10 10:47                                     ` Alejandro Colomar
2023-11-08  2:12           ` strncpy clarify result may not be null terminated Matthew House
2023-11-08 19:33             ` Alejandro Colomar
2023-11-08 19:40               ` Alejandro Colomar [this message]
2023-11-09  3:13               ` Matthew House
2023-11-09 10:26                 ` Jonny Grant
2023-11-09 10:31                 ` Jonny Grant
2023-11-09 11:38                   ` Alejandro Colomar
2023-11-09 12:43                     ` Alejandro Colomar
2023-11-09 12:51                     ` Xi Ruoyao
2023-11-09 14:01                       ` Alejandro Colomar
2023-11-09 18:11                     ` Paul Eggert
2023-11-09 23:48                       ` Alejandro Colomar
2023-11-10  5:36                         ` Paul Eggert
2023-11-10 11:05                           ` Alejandro Colomar
2023-11-10 11:47                             ` Alejandro Colomar
2023-11-10 17:58                             ` Paul Eggert
2023-11-10 18:36                               ` Alejandro Colomar
2023-11-10 20:19                                 ` Alejandro Colomar
2023-11-10 23:44                                   ` Jonny Grant
2023-11-10 19:52                               ` Alejandro Colomar
2023-11-10 22:14                                 ` Paul Eggert
2023-11-11 21:13                                   ` Alejandro Colomar
2023-11-11 22:20                                     ` Paul Eggert
2023-11-12  9:52                                     ` Jonny Grant
2023-11-12 10:59                                       ` Alejandro Colomar
2023-11-12 20:49                                         ` Paul Eggert
2023-11-12 21:00                                           ` Alejandro Colomar
2023-11-12 21:45                                             ` Alejandro Colomar
2023-11-13 23:46                                           ` Jonny Grant
2023-11-17 21:57                                         ` Jonny Grant
2023-11-18 10:12                                           ` Alejandro Colomar
2023-11-18 23:03                                             ` Jonny Grant
2023-11-10 11:36                           ` Jonny Grant
2023-11-10 13:15                             ` Alejandro Colomar
2023-11-18 23:40                               ` Jonny Grant
2023-11-20 11:56                                 ` Jonny Grant
2023-11-20 15:12                                   ` Alejandro Colomar
2023-11-20 23:08                                     ` Jonny Grant
2023-11-20 23:42                                       ` Alejandro Colomar
2023-11-10 11:23                     ` Jonny Grant
2023-11-09 12:23                 ` Alejandro Colomar
2023-11-09 12:35                   ` Alejandro Colomar
2023-11-10  7:06                   ` Oskari Pirhonen
2023-11-10 11:18                     ` Alejandro Colomar
2023-11-11  7:55                       ` Oskari Pirhonen
2023-11-10 16:06                   ` Matthew House
2023-11-10 17:48                     ` Alejandro Colomar
2023-11-13 15:01                       ` Matthew House
2023-11-11 20:55                     ` Jonny Grant
2023-11-11 21:15                       ` Jonny Grant
2023-11-11 22:36                         ` Alejandro Colomar
2023-11-11 23:19                           ` Alejandro Colomar
2023-11-17 21:46                           ` Jonny Grant
2023-11-18  9:37                             ` PDF book of unreleased pages (was: strncpy clarify result may not be null terminated) Alejandro Colomar
2023-11-19  0:22                               ` Deri
2023-11-19  1:19                                 ` Alejandro Colomar
2023-11-19  9:29                                   ` Alejandro Colomar
2023-11-19 16:21                                   ` Deri
2023-11-19 20:58                                     ` Alejandro Colomar
2023-11-20  0:46                                       ` G. Branden Robinson
2023-11-20  9:43                                         ` Alejandro Colomar
2023-11-18  9:44                             ` NULL safety " Alejandro Colomar
2023-11-18 23:21                               ` NULL safety Jonny Grant
2023-11-24 22:25                                 ` Alejandro Colomar
2023-11-25  0:57                                   ` Jonny Grant
2023-11-10 10:40               ` strncpy clarify result may not be null terminated Stefan Puiu
2023-11-10 11:06                 ` Jonny Grant
2023-11-10 11:20                 ` Alejandro Colomar
2023-11-12  9:17 ` [PATCH 0/2] Expand BUGS section of string_copying(7) Alejandro Colomar
2023-11-12  9:18 ` [PATCH 1/2] string_copying.7: BUGS: *cat(3) functions aren't always bad Alejandro Colomar
2023-11-12  9:18 ` [PATCH 2/2] string_copying.7: BUGS: Document strl{cpy,cat}(3)'s performance problems Alejandro Colomar
2023-11-12 11:26 ` [PATCH v2 0/3] Improve string_copying(7) Alejandro Colomar
2023-11-12 11:26 ` [PATCH v2 1/3] string_copying.7: BUGS: *cat(3) functions aren't always bad Alejandro Colomar
2023-11-17 21:43   ` Jonny Grant
2023-11-18  0:25     ` Signing all patches and email to this list Matthew House
2023-11-18 23:24       ` Jonny Grant
2023-11-12 11:26 ` [PATCH v2 2/3] string_copying.7: BUGS: Document strl{cpy,cat}(3)'s performance problems Alejandro Colomar
2023-11-12 11:27 ` [PATCH v2 3/3] strtcpy.3, string_copying.7: Add strtcpy(3) Alejandro Colomar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZUvkPQZiep7k5536@debian \
    --to=alx@kernel.org \
    --cc=jg@jguk.org \
    --cc=linux-man@vger.kernel.org \
    --cc=mattlloydhouse@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox