public inbox for linux-man@vger.kernel.org
 help / color / mirror / Atom feed
From: Alejandro Colomar <alx@kernel.org>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: linux-man@vger.kernel.org
Subject: Re: [PATCH 4/4] stpncpy(3) fixes
Date: Mon, 13 Nov 2023 02:29:38 +0100	[thread overview]
Message-ID: <ZVF8B-guyK2Zby4P@debian> (raw)
In-Reply-To: <20231112235218.80195-5-eggert@cs.ucla.edu>

[-- Attachment #1: Type: text/plain, Size: 7680 bytes --]

Hi Paul,

On Sun, Nov 12, 2023 at 03:52:08PM -0800, Paul Eggert wrote:
> Don't say "width" when "size" was meant.
Ok

> Prefer "byte" to the confusing word "character".
Ok

> Don't say that the source is a string; it need not be a string.
As said in string.3, I'm not convinced by the new wording either.

> Don't imply the result always has some null padding.
Ok

> Prefer standalone terminology.
Ok

> Fix indenting of prototype.
Ok

> Improve sample implementation by using memset rather than
> the less-standard bzero,

While bzero(3) is non-standard, it is simpler, so I prefer it.

> and by not overwriting part of
> the destination more than once which is confusing.
Ok

> Simplify example without losing its lessons.
See some comments inline.

> Use fwrite instead of printf to avoid assuming buffer length fits in int;
Thanks!  And even without the int issue, it's more consistent with
handling non-terminated bytes.

> although obviously this buffer length does fit, it's better if the sample
> code is general.
Yep.

> ---
>  man3/stpncpy.3 | 82 ++++++++++++++++++++++++--------------------------
>  1 file changed, 40 insertions(+), 42 deletions(-)
> 
> diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
> index 3cf4eb371..afe230307 100644
> --- a/man3/stpncpy.3
> +++ b/man3/stpncpy.3
> @@ -7,8 +7,8 @@
>  .SH NAME
>  stpncpy, strncpy
>  \-
> -fill a fixed-width buffer with characters from a string
> -and pad with null bytes
> +fill a fixed-size buffer with non-null bytes from a source array,
> +padding with null bytes as needed
>  .SH LIBRARY
>  Standard C library
>  .RI ( libc ", " \-lc )
> @@ -18,10 +18,10 @@ Standard C library
>  .P
>  .BI "char *strncpy(char " dst "[restrict ." sz "], \
>  const char *restrict " src ,
> -.BI "               size_t " sz );
> +.BI "              size_t " sz );
>  .BI "char *stpncpy(char " dst "[restrict ." sz "], \
>  const char *restrict " src ,
> -.BI "               size_t " sz );
> +.BI "              size_t " sz );
>  .fi
>  .P
>  .RS -4
> @@ -37,15 +37,18 @@ Feature Test Macro Requirements for glibc (see
>          _GNU_SOURCE
>  .fi
>  .SH DESCRIPTION
> -These functions copy characters from the string pointed to by
> +These functions copy non-null bytes from the array pointed to by
>  .I src
> -into a character sequence at the fixed-width buffer pointed to by
> -.IR dst ,
> -and pad with null bytes.
> -If the destination buffer,
> -limited by its size,
> -isn't large enough to hold the copy,
> -the resulting character sequence is truncated.
> +into the array that is pointed to by
> +.I dst
> +and that contains
> +.I sz
> +bytes.
> +If the source has too few non-null bytes to fill the destination,
> +the functions pad the destination with trailing null bytes;
> +if it has too many non-null bytes, the functions copy only the first
> +.I sz
> +bytes and do not append any null by5tes.
>  For the difference between the two functions, see RETURN VALUE.
>  .P
>  An implementation of these functions might be:
> @@ -62,8 +65,8 @@ strncpy(char *restrict dst, const char *restrict src, size_t sz)
>  char *
>  stpncpy(char *restrict dst, const char *restrict src, size_t sz)
>  {
> -    bzero(dst, sz);
> -    return mempcpy(dst, src, strnlen(src, sz));
> +    size_t n = strnlen(src, sz);
> +    return memset(mempcpy(dst, src, n), 0, sz - n);

Hmm, I hadn't thought of this chaining.  Nice!

>  }
>  .EE
>  .in
> @@ -75,7 +78,7 @@ returns
>  .TP
>  .BR stpncpy ()
>  returns a pointer to
> -one after the last character in the destination character sequence.
> +one after the last byte in the destination byte sequence.
>  .SH ATTRIBUTES
>  For an explanation of the terms used in this section, see
>  .BR attributes (7).
> @@ -107,9 +110,10 @@ C89, POSIX.1-2001, SVr4, 4.3BSD.
>  glibc 1.07.
>  POSIX.1-2008.
>  .SH CAVEATS
> -The name of these functions is confusing.
> -These functions produce a null-padded character sequence,
> -not a string (see
> +The names of these functions are confusing.
> +Because these functions append null bytes only if the source is a
> +string with length less than the destination size,
> +they might not create a string in their destination (see
>  .BR string_copying (7)).
>  For example:
>  .P
> @@ -122,14 +126,12 @@ strncpy(buf, "123456", 5);  // { \[aq]1\[aq], \[aq]2\[aq], \[aq]3\[aq], \[aq]4\[
>  .EE
>  .in
>  .P
> -It's impossible to distinguish truncation by the result of the call,
> -from a character sequence that just fits the destination buffer;
> -truncation should be detected by
> -comparing the length of the input string
> -with the size of the destination buffer.
> +Although these functions can be used with strings,
> +it is the caller's responsibility to detect whether they produce a string,
> +e.g., by checking whether the result buffer ends in a null byte.
>  .P
> -If you're going to use this function in chained calls,
> -it would be useful to develop a similar function that accepts
> +To use these functions in chained calls,
> +it might be useful to develop wrapper functions that accept
>  a pointer to the end (one after the last element) of the destination buffer
>  instead of its size.
>  .SH EXAMPLES
> @@ -141,30 +143,26 @@ instead of its size.
>  #include <string.h>
>  \&
>  int
> -main(void)
> +main(int argc, char **argv)
>  {
> -    char    *p;
> -    char    buf1[20];
> -    char    buf2[20];
> -    size_t  len;
> +    char buf[20];
>  \&
> -    if (sizeof(buf2) < strlen("Hello world!"))
> -        warnx("strncpy: truncating character sequence");
> -    strncpy(buf2, "Hello world!", sizeof(buf2));
> -    len = strnlen(buf2, sizeof(buf2));
> +    if (strncpy(buf, argv[0], sizeof buf)[sizeof buf - 1])
> +        warnx("strncpy: destination buffer is not a string");

The previous warning was slightly different.  It was warning because
data was lost.  Now you warn when it's not a string, even if all the
data is there.  Since we're later using the data with fwrite(3), I think
it makes more sense to warn iff data has been truncated, ignoring the
fact that it's not a string (the whole page is saying this doesn't
produce a string).

So, I think this change is a bug.

I understand you may want to show how these functions can be used to
produce a string.  Maybe having another example (not necessarily a whole
program) would be useful.  If so, please put that one below this one, to
give the intention that it's only a secondary use of this function, and
not intended for everyone.

> +    size_t len = strnlen(buf, sizeof buf);

C89 declarations, please.

>  \&
>      printf("[len = %zu]: ", len);
> -    printf("%.*s\en", (int) len, buf2);  // "Hello world!"
> +    fwrite(buf, 1, len, stdout);

I like fwrite() better.  Good!  I wasn't happy with this weird printf().

Cheers,
Alex

> +    putchar(\[aq]\en\[aq]);
>  \&
> -    if (sizeof(buf1) < strlen("Hello world!"))
> -        warnx("stpncpy: truncating character sequence");
> -    p = stpncpy(buf1, "Hello world!", sizeof(buf1));
> -    len = p \- buf1;
> +    char *p = stpncpy(buf, argv[0], sizeof buf);
> +    if (p == buf + sizeof buf)
> +        warnx("stpncpy: destination buffer is not a string");
> +    len = p \- buf;
>  \&
>      printf("[len = %zu]: ", len);
> -    printf("%.*s\en", (int) len, buf1);  // "Hello world!"
> -\&
> -    exit(EXIT_SUCCESS);
> +    fwrite(buf, 1, len, stdout);
> +    putchar(\[aq]\en\[aq]);
>  }
>  .EE
>  .\" SRC END
> -- 
> 2.41.0
> 
> 

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2023-11-13  1:29 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-12 23:52 [PATCH 0/4] improvements for strncpy.3 etc Paul Eggert
2023-11-12 23:52 ` [PATCH 1/4] * Remove man3/stpecpyx.3; no longer present Paul Eggert
2023-11-13  0:18   ` Alejandro Colomar
2023-11-12 23:52 ` [PATCH 2/4] string.3 fixes Paul Eggert
2023-11-13  0:53   ` Alejandro Colomar
2023-11-13 22:27     ` Paul Eggert
2023-11-13 23:25       ` Alejandro Colomar
2023-11-12 23:52 ` [PATCH 3/4] strncat.3 fixes Paul Eggert
2023-11-13  1:15   ` Alejandro Colomar
2023-11-13 16:23     ` Alejandro Colomar
2023-11-21 16:03     ` Alejandro Colomar
2023-11-12 23:52 ` [PATCH 4/4] stpncpy(3) fixes Paul Eggert
2023-11-13  1:29   ` Alejandro Colomar [this message]
2023-11-21 18:33     ` Alejandro Colomar
2023-11-21 21:18       ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZVF8B-guyK2Zby4P@debian \
    --to=alx@kernel.org \
    --cc=eggert@cs.ucla.edu \
    --cc=linux-man@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox