public inbox for linux-man@vger.kernel.org
From: Alejandro Colomar <alx.manpages@gmail.com>
To: Stefan Puiu <stefan.puiu@gmail.com>
Cc: linux-man@vger.kernel.org, Alejandro Colomar <alx@kernel.org>,
	Martin Sebor <msebor@redhat.com>,
	"G. Branden Robinson" <g.branden.robinson@gmail.com>,
	Douglas McIlroy <douglas.mcilroy@dartmouth.edu>,
	Jakub Wilk <jwilk@jwilk.net>, Serge Hallyn <serge@hallyn.com>,
	Iker Pedrosa <ipedrosa@redhat.com>,
	Andrew Pinski <pinskia@gmail.com>
Subject: Re: [PATCH v6 1/5] string_copy.7: Add page to document all string-copying functions
Date: Tue, 20 Dec 2022 16:03:54 +0100
Message-ID: <b555606a-ba56-3543-d9dd-debbc89fa3e3@gmail.com>
In-Reply-To: <CACKs7VC-2j7cK3AYBAx5yxrJTXb1EAarjXhmOBDKcCNgyY1EZA@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 31116 bytes --]

Hi Stefan,

On 12/20/22 16:00, Stefan Puiu wrote:
> Hi,
> 
> Noticed a typo below

Typo fixed.  Thanks,

Alex

<https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=3d395282860f7b86f65c6735351f24b52c486718>

> 
> On Mon, Dec 19, 2022 at 11:02 PM Alejandro Colomar
> <alx.manpages@gmail.com> wrote:
>>
>> This is an opportunity to use consistent language across the
>> documentation for all string-copying functions.
>>
>> It is also easier to show the similarities and differences between all
>> of the functions, so that a reader can use this page to know which
>> function is needed for a given task.
>>
>> Alternative functions not provided by libc have been given in the same
>> page, with reference implementations.
>>
>> Cc: Martin Sebor <msebor@redhat.com>
>> Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
>> Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
>> Cc: Jakub Wilk <jwilk@jwilk.net>
>> Cc: Serge Hallyn <serge@hallyn.com>
>> Cc: Iker Pedrosa <ipedrosa@redhat.com>
>> Cc: Andrew Pinski <pinskia@gmail.com>
>> Cc: Stefan Puiu <stefan.puiu@gmail.com>
>> Signed-off-by: Alejandro Colomar <alx@kernel.org>
>> ---
>>   man7/string_copy.7 | 855 +++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 855 insertions(+)
>>   create mode 100644 man7/string_copy.7
>>
>> diff --git a/man7/string_copy.7 b/man7/string_copy.7
>> new file mode 100644
>> index 000000000..a32b93c01
>> --- /dev/null
>> +++ b/man7/string_copy.7
>> @@ -0,0 +1,855 @@
>> +.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
>> +.\"
>> +.\" SPDX-License-Identifier: BSD-3-Clause
>> +.\"
>> +.TH string_copy 7 (date) "Linux man-pages (unreleased)"
>> +.\" ----- NAME :: -----------------------------------------------------/
>> +.SH NAME
>> +stpcpy,
>> +strcpy, strcat,
>> +stpecpy, stpecpyx,
>> +strlcpy, strlcat,
>> +stpncpy,
>> +strncpy,
>> +zustr2ustp, zustr2stp,
>> +strncat,
>> +ustpcpy, ustr2stp
>> +\- copy strings and character sequences
>> +.\" ----- SYNOPSIS :: -------------------------------------------------/
>> +.SH SYNOPSIS
>> +.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
>> +.SS Strings
>> +.nf
>> +// Chain-copy a string.
>> +.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
>> +.PP
>> +// Copy/catenate a string.
>> +.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
>> +.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
>> +.PP
>> +// Chain-copy a string with truncation.
>> +.BI "char *stpecpy(char *" dst ", char " end "[0], const char *restrict " src );
>> +.PP
>> +// Chain-copy a string with truncation and SIGSEGV on UB.
>> +.BI "char *stpecpyx(char *" dst ", char " end "[0], const char *restrict " src );
>> +.PP
>> +// Copy/catenate a string with truncation and SIGSEGV on UB.
>> +.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
>> +const char *restrict " src ,
>> +.BI "               size_t " sz );
>> +.BI "size_t strlcat(char " dst "[restrict ." sz "], \
>> +const char *restrict " src ,
>> +.BI "               size_t " sz );
>> +.fi
>> +.\" ----- SYNOPSIS :: Null-padded character sequences --------/
>> +.SS Null-padded character sequences
>> +.nf
>> +// Zero a fixed-width buffer, and
>> +// copy a string into a character sequence with truncation.
>> +.BI "char *stpncpy(char " dst "[restrict ." sz "], \
>> +const char *restrict " src ,
>> +.BI "               size_t " sz );
>> +.PP
>> +// Zero a fixed-width buffer, and
>> +// copy a string into a character sequence with truncation.
>> +.BI "char *strncpy(char " dst "[restrict ." sz "], \
>> +const char *restrict " src ,
>> +.BI "               size_t " sz );
>> +.PP
>> +// Chain-copy a null-padded character sequence into a character sequence.
>> +.BI "char *zustr2ustp(char *restrict " dst ", \
>> +const char " src "[restrict ." sz ],
>> +.BI "               size_t " sz );
>> +.PP
>> +// Chain-copy a null-padded character sequence into a string.
>> +.BI "char *zustr2stp(char *restrict " dst ", \
>> +const char " src "[restrict ." sz ],
>> +.BI "               size_t " sz );
>> +.PP
>> +// Catenate a null-padded character sequence into a string.
>> +.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
>> +.BI "               size_t " sz );
>> +.fi
>> +.\" ----- SYNOPSIS :: Measured character sequences --------------------/
>> +.SS Measured character sequences
>> +.nf
>> +// Chain-copy a measured character sequence.
>> +.BI "char *ustpcpy(char *restrict " dst ", \
>> +const char " src "[restrict ." len ],
>> +.BI "               size_t " len );
>> +.PP
>> +// Chain-copy a measured character sequence into a string.
>> +.BI "char *ustr2stp(char *restrict " dst ", \
>> +const char " src "[restrict ." len ],
>> +.BI "               size_t " len );
>> +.fi
>> +.SH DESCRIPTION
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
>> +.SS Terms (and abbreviations)
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
>> +.TP
>> +.IR "string " ( str )
>> +is a sequence of zero or more non-null characters followed by a null byte.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
>> +.TP
>> +.I character sequence
>> +is a sequence of zero or more non-null characters.
>> +A program should never usa a character sequence where a string is required.
> 
> Here I think you want s/usa/use above.
> 
> Thanks,
> Stefan.
> 
>> +However, with appropriate care,
>> +a string can be used in the place of a character sequence.
>> +.RS
>> +.TP
>> +.IR "null-padded character sequence " ( zustr )
>> +Character sequences can be contained in fixed-width buffers,
>> +which contain padding null bytes after the character sequence,
>> +to fill the rest of the buffer
>> +without affecting the character sequence;
>> +however, those padding null bytes are not part of the character sequence.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
>> +.TP
>> +.IR "measured character sequence " ( ustr )
>> +Character sequence delimited by its length.
>> +It may be a slice of a larger character sequence,
>> +or even of a string.
>> +.RE
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
>> +.TP
>> +.IR "length " ( len )
>> +is the number of non-null characters in a string or character sequence.
>> +It is the return value of
>> +.I strlen(str)
>> +and of
>> +.IR "strnlen(ustr, sz)" .
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
>> +.TP
>> +.IR "size " ( sz )
>> +refers to the entire buffer
>> +where the string or character sequence is contained.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
>> +.TP
>> +.I end
>> +is the name of a pointer to one past the last element of a buffer.
>> +It is equivalent to
>> +.IR &str[sz] .
>> +It is used as a sentinel value,
>> +to be able to truncate strings or character sequences
>> +instead of overrunning the containing buffer.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: copy ------------/
>> +.TP
>> +.I copy
>> +This term is used when
>> +the writing starts at the first element pointed to by
>> +.IR dst .
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: catenate --------/
>> +.TP
>> +.I catenate
>> +This term is used when
>> +a function first finds the terminating null byte in
>> +.IR dst ,
>> +and then starts writing at that position.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: chain -----------/
>> +.TP
>> +.I chain
>> +This term is used when
>> +it's the programmer who provides
>> +a pointer to the terminating null byte in the string
>> +.I dst
>> +(or one after the last character in a character sequence),
>> +and the function starts writing at that location.
>> +The function returns
>> +a pointer to the new location of the terminating null byte
>> +(or one after the last character in a character sequence)
>> +after the call,
>> +so that the programmer can use it to chain such calls.
>> +.\" ----- DESCRIPTION :: Copy, catenate, and chain-copy ---------------/
>> +.SS Copy, catenate, and chain-copy
>> +Originally,
>> +there was a distinction between functions that copy and those that catenate.
>> +However, newer functions that copy while allowing chaining
>> +cover both use cases with a single API.
>> +They are also algorithmically faster,
>> +since they don't need to search for
>> +the terminating null byte of the existing string.
>> +However, functions that catenate have a much simpler use,
>> +so if performance is not important,
>> +it can make sense to use them for improving readability.
>> +.PP
>> +The pointer returned by functions that allow chaining
>> +is a byproduct of the copy operation,
>> +so it has no performance costs.
>> +Functions that return such a pointer,
>> +and thus can be chained,
>> +have names of the form
>> +.RB * stp *(),
>> +since it's common to name the pointer just
>> +.IR p .
>> +.PP
>> +Chain-copying functions that truncate
>> +should accept a pointer to the end of the destination buffer,
>> +and have names of the form
>> +.RB * stpe *().
>> +This allows not having to recalculate the remaining size after each call.
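
The difference between catenating and chain-copying described above can be sketched as follows. This is only an illustration, assuming a libc that provides stpcpy(3); join_cat() and join_chain() are hypothetical names that do not appear in the patch.

```c
#define _XOPEN_SOURCE 700  /* for stpcpy(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: join three strings by catenation.
 * Each strcat(3) call rescans dst to find its terminating null byte. */
size_t
join_cat(char *dst, const char *a, const char *b, const char *c)
{
    strcpy(dst, a);
    strcat(dst, b);
    strcat(dst, c);
    return strlen(dst);  /* one more scan to learn the length */
}

/* Hypothetical helper: same result by chain-copying.
 * Each call resumes where the previous one stopped. */
size_t
join_chain(char *dst, const char *a, const char *b, const char *c)
{
    char *p = dst;

    p = stpcpy(p, a);
    p = stpcpy(p, b);
    p = stpcpy(p, c);
    assert(*p == '\0');  /* p points to the terminating null byte */
    return p - dst;      /* the length is a byproduct of the chain */
}
```

Both helpers expect the caller to provide a large enough buffer; neither truncates.
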
>> +.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
>> +.SS Truncate or not?
>> +The first thing to note is that programmers should be careful with buffers,
>> +so they always have the correct size,
>> +and truncation is not necessary.
>> +.PP
>> +In most cases,
>> +truncation is not desired,
>> +and it is simpler to just do the copy.
>> +Simpler code is safer code.
>> +Programming against programming mistakes by adding more code
>> +just adds more points where mistakes can be made.
>> +.PP
>> +Nowadays,
>> +compilers can detect most programmer errors with features like
>> +compiler warnings,
>> +static analyzers, and
>> +.BR \%_FORTIFY_SOURCE
>> +(see
>> +.BR ftm (7)).
>> +Keeping the code simple
>> +helps these overflow-detection features be more precise.
>> +.PP
>> +When validating user input,
>> +however,
>> +it makes sense to truncate.
>> +Remember to check the return value of such function calls.
>> +.PP
>> +Functions that truncate:
>> +.IP \(bu 3
>> +.BR stpecpy (3)
>> +is the most efficient string copy function that performs truncation.
>> +It requires checking for truncation only once, after all chained calls.
>> +.IP \(bu
>> +.BR stpecpyx (3)
>> +is a variant of
>> +.BR stpecpy (3)
>> +that consumes the entire source string,
>> +to catch bugs in the program
>> +by forcing a segmentation fault (as
>> +.BR strlcpy (3bsd)
>> +and
>> +.BR strlcat (3bsd)
>> +do).
>> +.IP \(bu
>> +.BR strlcpy (3bsd)
>> +and
>> +.BR strlcat (3bsd)
>> +are designed to crash if the input string is invalid
>> +(doesn't contain a terminating null byte).
>> +.IP \(bu
>> +.BR stpncpy (3)
>> +and
>> +.BR strncpy (3)
>> +also truncate, but they don't write strings,
>> +but rather null-padded character sequences.
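
A minimal runnable sketch of the "check once after the chain" idea from the list above. The stpecpy() body follows the reference implementation quoted later in this patch, except that the sketch declares end as a plain pointer instead of char end[0] and adds an assert; join2() is a hypothetical helper introduced only for illustration.

```c
#define _XOPEN_SOURCE 700  /* for memccpy(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Adapted from the reference implementation in the patch. */
char *
stpecpy(char *dst, char *end, const char *restrict src)
{
    char *p;

    assert(dst <= end);  /* added check, not in the patch */

    if (dst == end)
        return end;      /* an earlier call already truncated */

    p = memccpy(dst, src, '\0', end - dst);
    if (p != NULL)
        return p - 1;    /* pointer to the terminating null byte */

    /* Truncation: terminate the string in the last byte of the buffer. */
    end[-1] = '\0';
    return end;
}

/* Hypothetical helper: write a followed by b into buf.
 * Returns 0 on success, -1 if the result was truncated. */
int
join2(char *buf, size_t size, const char *a, const char *b)
{
    char *end = buf + size;
    char *p = buf;

    p = stpecpy(p, end, a);
    p = stpecpy(p, end, b);
    return (p == end) ? -1 : 0;  /* single check for the whole chain */
}
```

After truncation the buffer still holds a null-terminated string, so a caller that prefers to continue with the shortened text can do so.
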
>> +.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
>> +.SS Null-padded character sequences
>> +For historic reasons,
>> +some standard APIs,
>> +such as
>> +.BR utmpx (5),
>> +use null-padded character sequences in fixed-width buffers.
>> +To interface with them,
>> +specialized functions need to be used.
>> +.PP
>> +To copy strings into them, use
>> +.BR stpncpy (3).
>> +.PP
>> +To copy from an unterminated string within a fixed-width buffer into a string,
>> +ignoring any trailing null bytes in the source fixed-width buffer,
>> +you should use
>> +.BR zustr2stp (3)
>> +or
>> +.BR strncat (3).
>> +.PP
>> +To copy from an unterminated string within a fixed-width buffer
>> +into a character sequence,
>> +ignoring any trailing null bytes in the source fixed-width buffer,
>> +you should use
>> +.BR zustr2ustp (3).
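
As a sketch of how the section above might be applied, assume a record with a fixed-width, null-padded field in the style of utmpx(5). The struct and both helpers are hypothetical; record_get_name() is meant to behave like the zustr2stp() reference implementation, written with strnlen(3) and memcpy(3) so that it needs nothing outside POSIX.

```c
#define _XOPEN_SOURCE 700  /* for stpncpy(3), strnlen(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a fixed-width, null-padded field. */
struct record {
    char name[8];  /* not null-terminated when all 8 bytes are used */
};

/* Store a string in the field, zeroing the unused tail.
 * Returns -1 if the string did not fit and was truncated. */
int
record_set_name(struct record *r, const char *name)
{
    stpncpy(r->name, name, sizeof(r->name));
    return (strlen(name) > sizeof(r->name)) ? -1 : 0;
}

/* Copy the field into a string, ignoring the padding null bytes.
 * dst needs room for sizeof(r->name) + 1 bytes.
 * Returns a pointer to the terminating null byte written in dst. */
char *
record_get_name(char *dst, const struct record *r)
{
    size_t len = strnlen(r->name, sizeof(r->name));

    assert(len <= sizeof(r->name));
    memcpy(dst, r->name, len);
    dst[len] = '\0';
    return dst + len;
}
```

Truncation is detected by comparing the input length with the field size, since the return value of stpncpy(3) cannot tell a truncated copy from one that fits exactly.
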
>> +.\" ----- DESCRIPTION :: Measured character sequences -----------------/
>> +.SS Measured character sequences
>> +The simplest character sequence copying function is
>> +.BR mempcpy (3).
>> +It requires always knowing the length of your character sequences,
>> +for which structures can be used.
>> +It makes the code much faster,
>> +since you always know the length of your character sequences,
>> +and can do the minimal copies and length measurements.
>> +.BR mempcpy (3)
>> +copies character sequences,
>> +so you need to explicitly set the terminating null byte if you need a string.
>> +.PP
>> +However,
>> +for keeping type safety,
>> +it's good to add a wrapper that uses
>> +.I char\~*
>> +instead of
>> +.IR void\~* :
>> +.BR ustpcpy (3).
>> +.PP
>> +In programs that make considerable use of strings or character sequences,
>> +and need the best performance,
>> +using overlapping character sequences can make a big difference.
>> +It allows holding subsequences of a larger character sequence,
>> +while not duplicating memory
>> +nor using time to do a copy.
>> +.PP
>> +However, this is delicate,
>> +since it requires using character sequences.
>> +C library APIs use strings,
>> +so programs that use character sequences
>> +will have to take care of differentiating strings from character sequences.
>> +.PP
>> +To copy a measured character sequence, use
>> +.BR ustpcpy (3).
>> +.PP
>> +To copy a measured character sequence into a string, use
>> +.BR ustr2stp (3).
>> +.PP
>> +Because these functions ask for the length,
>> +and a string is by nature composed of a character sequence of the same length
>> +plus a terminating null byte,
>> +a string is also accepted as input.
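
The structure-based approach mentioned above could look roughly like this. struct ustr, ustr_until() and ustr_to_str() are hypothetical names chosen for the sketch; ustr_to_str() mirrors the ustr2stp() reference implementation but uses memcpy(3) instead of the GNU extension mempcpy(3).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical measured character sequence: a pointer plus a length. */
struct ustr {
    const char *p;
    size_t      len;
};

/* Slice a string up to the first occurrence of sep, without copying.
 * If sep does not occur, the slice covers the whole string. */
struct ustr
ustr_until(const char *s, char sep)
{
    const char *q;
    struct ustr u;

    assert(s != NULL);
    q = strchr(s, sep);
    u.p = s;
    u.len = (q != NULL) ? (size_t) (q - s) : strlen(s);
    return u;
}

/* Copy a measured character sequence into a string.
 * dst needs room for u.len + 1 bytes.
 * Returns a pointer to the terminating null byte written in dst. */
char *
ustr_to_str(char *dst, struct ustr u)
{
    memcpy(dst, u.p, u.len);
    dst[u.len] = '\0';
    return dst + u.len;
}
```

The slice shares memory with the original string, so it is only valid while that string is alive and unmodified; the terminating null byte is added only when a real string is needed.
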
>> +.\" ----- DESCRIPTION :: String vs character sequence -----------------/
>> +.SS String vs character sequence
>> +Some functions only operate on strings.
>> +Those require that the input
>> +.I src
>> +is a string,
>> +and guarantee an output string
>> +(even when truncation occurs).
>> +Functions that catenate
>> +also require that
>> +.I dst
>> +holds a string before the call.
>> +List of functions:
>> +.IP \(bu 3
>> +.PD 0
>> +.BR stpcpy (3)
>> +.IP \(bu
>> +.BR strcpy "(3), \c"
>> +.BR strcat (3)
>> +.IP \(bu
>> +.BR stpecpy "(3), \c"
>> +.BR stpecpyx (3)
>> +.IP \(bu
>> +.BR strlcpy "(3bsd), \c"
>> +.BR strlcat (3bsd)
>> +.PD
>> +.PP
>> +Other functions require an input string,
>> +but create a character sequence as output.
>> +These functions have confusing names,
>> +and have a long history of misuse.
>> +List of functions:
>> +.IP \(bu 3
>> +.PD 0
>> +.BR stpncpy (3)
>> +.IP \(bu
>> +.BR strncpy (3)
>> +.PD
>> +.PP
>> +Other functions operate on an input character sequence,
>> +and create an output string.
>> +Functions that catenate
>> +also require that
>> +.I dst
>> +holds a string before the call.
>> +.BR strncat (3)
>> +has an even more misleading name than the functions above.
>> +List of functions:
>> +.IP \(bu 3
>> +.PD 0
>> +.BR zustr2stp (3)
>> +.IP \(bu
>> +.BR strncat (3)
>> +.IP \(bu
>> +.BR ustr2stp (3)
>> +.PD
>> +.PP
>> +Other functions operate on an input character sequence
>> +to create an output character sequence.
>> +List of functions:
>> +.IP \(bu 3
>> +.PD 0
>> +.BR ustpcpy (3)
>> +.IP \(bu
>> +.BR zustr2ustp (3)
>> +.PD
>> +.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
>> +.SS Functions
>> +.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
>> +.TP
>> +.BR stpcpy (3)
>> +This function copies the input string into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
>> +.TP
>> +.BR strcpy (3)
>> +.TQ
>> +.BR strcat (3)
>> +These functions copy and catenate the input string into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +The return value is useless.
>> +.IP
>> +.BR stpcpy (3)
>> +is a faster alternative to these functions.
>> +.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
>> +.TP
>> +.BR stpecpy (3)
>> +.TQ
>> +.BR stpecpyx (3)
>> +These functions copy the input string into a destination string.
>> +If the destination buffer,
>> +limited by a pointer to its end,
>> +isn't large enough to hold the copy,
>> +the resulting string is truncated
>> +(but it is guaranteed to be null-terminated).
>> +They return a pointer suitable for chaining.
>> +Truncation needs to be detected only once after the last chained call.
>> +.BR stpecpyx (3)
>> +has identical semantics to
>> +.BR stpecpy (3),
>> +except that it forces a SIGSEGV if the
>> +.I src
>> +pointer is not a string.
>> +.IP
>> +These functions are not provided by any library;
>> +see EXAMPLES for a reference implementation.
>> +.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
>> +.TP
>> +.BR strlcpy (3bsd)
>> +.TQ
>> +.BR strlcat (3bsd)
>> +These functions copy and catenate the input string into a destination string.
>> +If the destination buffer,
>> +limited by its size,
>> +isn't large enough to hold the copy,
>> +the resulting string is truncated
>> +(but it is guaranteed to be null-terminated).
>> +They return the length of the total string they tried to create.
>> +These functions force a SIGSEGV if the
>> +.I src
>> +pointer is not a string.
>> +.IP
>> +.BR stpecpyx (3)
>> +is a faster alternative to these functions.
>> +.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
>> +.TP
>> +.BR stpncpy (3)
>> +This function copies the input string into
>> +a destination null-padded character sequence in a fixed-width buffer.
>> +If the destination buffer,
>> +limited by its size,
>> +isn't large enough to hold the copy,
>> +the resulting character sequence is truncated.
>> +Since it creates a character sequence,
>> +it doesn't need to write a terminating null byte.
>> +It's impossible to distinguish truncation by the result of the call,
>> +from a character sequence that just fits the destination buffer;
>> +truncation should be detected by
>> +comparing the length of the input string
>> +with the size of the destination buffer.
>> +.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
>> +.TP
>> +.BR strncpy (3)
>> +This function is identical to
>> +.BR stpncpy (3)
>> +except for the useless return value.
>> +.IP
>> +.BR stpncpy (3)
>> +is a more useful alternative to this function.
>> +.\" ----- DESCRIPTION :: Functions :: zustr2ustp(3) --------------------/
>> +.TP
>> +.BR zustr2ustp (3)
>> +This function copies the input character sequence
>> +contained in a null-padded fixed-width buffer,
>> +into a destination character sequence.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.IP
>> +A truncating version of this function doesn't exist,
>> +since the size of the original character sequence is always known,
>> +so it wouldn't be very useful.
>> +.IP
>> +This function is not provided by any library;
>> +see EXAMPLES for a reference implementation.
>> +.\" ----- DESCRIPTION :: Functions :: zustr2stp(3) --------------------/
>> +.TP
>> +.BR zustr2stp (3)
>> +This function copies the input character sequence
>> +contained in a null-padded fixed-width buffer,
>> +into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.IP
>> +A truncating version of this function doesn't exist,
>> +since the size of the original character sequence is always known,
>> +so it wouldn't be very useful.
>> +.IP
>> +This function is not provided by any library;
>> +see EXAMPLES for a reference implementation.
>> +.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
>> +.TP
>> +.BR strncat (3)
>> +Do not confuse this function with
>> +.BR strncpy (3);
>> +they are not related at all.
>> +.IP
>> +This function catenates the input character sequence
>> +contained in a null-padded fixed-width buffer,
>> +into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +The return value is useless.
>> +.IP
>> +.BR zustr2stp (3)
>> +is a faster alternative to this function.
>> +.\" ----- DESCRIPTION :: Functions :: ustpcpy(3) ----------------------/
>> +.TP
>> +.BR ustpcpy (3)
>> +This function copies the input character sequence,
>> +limited by its length,
>> +into a destination character sequence.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
>> +.TP
>> +.BR ustr2stp (3)
>> +This function copies the input character sequence,
>> +limited by its length,
>> +into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.\" ----- RETURN VALUE :: ---------------------------------------------/
>> +.SH RETURN VALUE
>> +The following functions return
>> +a pointer to the terminating null byte in the destination string.
>> +.IP \(bu 3
>> +.PD 0
>> +.BR stpcpy (3)
>> +.IP \(bu
>> +.BR ustr2stp (3)
>> +.IP \(bu
>> +.BR zustr2stp (3)
>> +.PD
>> +.PP
>> +The following functions return
>> +a pointer to the terminating null byte in the destination string,
>> +except when truncation occurs;
>> +if truncation occurs,
>> +they return a pointer to the end of the destination buffer.
>> +.IP \(bu 3
>> +.BR stpecpy (3),
>> +.BR stpecpyx (3)
>> +.PP
>> +The following function returns
>> +a pointer to one after the last character
>> +in the destination character sequence;
>> +if truncation occurs,
>> +that pointer is equivalent to
>> +a pointer to the end of the destination buffer.
>> +.IP \(bu 3
>> +.BR stpncpy (3)
>> +.PP
>> +The following functions return
>> +a pointer to one after the last character
>> +in the destination character sequence.
>> +.IP \(bu 3
>> +.PD 0
>> +.BR zustr2ustp (3)
>> +.IP \(bu
>> +.BR ustpcpy (3)
>> +.PD
>> +.PP
>> +The following functions return
>> +the length of the total string that they tried to create
>> +(as if truncation didn't occur).
>> +.IP \(bu 3
>> +.BR strlcpy (3bsd),
>> +.BR strlcat (3bsd)
>> +.PP
>> +The following functions return the
>> +.I dst
>> +pointer,
>> +which is useless.
>> +.IP \(bu 3
>> +.PD 0
>> +.BR strcpy (3),
>> +.BR strcat (3)
>> +.IP \(bu
>> +.BR strncpy (3)
>> +.IP \(bu
>> +.BR strncat (3)
>> +.PD
>> +.\" ----- NOTES :: strscpy(9) -----------------------------------------/
>> +.SH NOTES
>> +The Linux kernel has an internal function for copying strings,
>> +which is similar to
>> +.BR stpecpy (3),
>> +except that it can't be chained:
>> +.TP
>> +.BR strscpy (9)
>> +This function copies the input string into a destination string.
>> +If the destination buffer,
>> +limited by its size,
>> +isn't large enough to hold the copy,
>> +the resulting string is truncated
>> +(but it is guaranteed to be null-terminated).
>> +It returns the length of the destination string, or
>> +.B \-E2BIG
>> +on truncation.
>> +.IP
>> +.BR stpecpy (3)
>> +is a simpler and faster alternative to this function.
>> +.RE
>> +.\" ----- CAVEATS :: --------------------------------------------------/
>> +.SH CAVEATS
>> +Don't mix chain calls to truncating and non-truncating functions.
>> +It is conceptually wrong
>> +unless you know that the first part of a copy will always fit.
>> +Anyway, the performance difference will probably be negligible,
>> +so it will probably be clearer if you use consistent semantics:
>> +either truncating or non-truncating.
>> +Calling a non-truncating function after a truncating one is necessarily wrong.
>> +.\" ----- BUGS :: -----------------------------------------------------/
>> +.SH BUGS
>> +All catenation functions share the same performance problem:
>> +.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
>> +Shlemiel the painter
>> +.UE .
>> +.\" ----- EXAMPLES :: -------------------------------------------------/
>> +.SH EXAMPLES
>> +The following are examples of correct use of each of these functions.
>> +.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
>> +.TP
>> +.BR stpcpy (3)
>> +.EX
>> +p = buf;
>> +p = stpcpy(p, "Hello ");
>> +p = stpcpy(p, "world");
>> +p = stpcpy(p, "!");
>> +len = p \- buf;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
>> +.TP
>> +.BR strcpy (3)
>> +.TQ
>> +.BR strcat (3)
>> +.EX
>> +strcpy(buf, "Hello ");
>> +strcat(buf, "world");
>> +strcat(buf, "!");
>> +len = strlen(buf);
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
>> +.TP
>> +.BR stpecpy (3)
>> +.TQ
>> +.BR stpecpyx (3)
>> +.EX
>> +end = buf + sizeof(buf);
>> +p = buf;
>> +p = stpecpy(p, end, "Hello ");
>> +p = stpecpy(p, end, "world");
>> +p = stpecpy(p, end, "!");
>> +if (p == end) {
>> +    p\-\-;
>> +    goto toolong;
>> +}
>> +len = p \- buf;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
>> +.TP
>> +.BR strlcpy (3bsd)
>> +.TQ
>> +.BR strlcat (3bsd)
>> +.EX
>> +if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
>> +    goto toolong;
>> +if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
>> +    goto toolong;
>> +len = strlcat(buf, "!", sizeof(buf));
>> +if (len >= sizeof(buf))
>> +    goto toolong;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: strscpy(9) --------------------------------------/
>> +.TP
>> +.BR strscpy (9)
>> +.EX
>> +len = strscpy(buf, "Hello world!", sizeof(buf));
>> +if (len == \-E2BIG)
>> +    goto toolong;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
>> +.TP
>> +.BR stpncpy (3)
>> +.EX
>> +p = stpncpy(buf, "Hello world!", sizeof(buf));
>> +if (sizeof(buf) < strlen("Hello world!"))
>> +    goto toolong;
>> +len = p \- buf;
>> +for (size_t i = 0; i < sizeof(buf); i++)
>> +    putchar(buf[i]);
>> +.EE
>> +.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
>> +.TP
>> +.BR strncpy (3)
>> +.EX
>> +strncpy(buf, "Hello world!", sizeof(buf));
>> +if (sizeof(buf) < strlen("Hello world!"))
>> +    goto toolong;
>> +len = strnlen(buf, sizeof(buf));
>> +for (size_t i = 0; i < sizeof(buf); i++)
>> +    putchar(buf[i]);
>> +.EE
>> +.\" ----- EXAMPLES :: zustr2ustp(3) -----------------------------------/
>> +.TP
>> +.BR zustr2ustp (3)
>> +.EX
>> +p = buf;
>> +p = zustr2ustp(p, "Hello ", 6);
>> +p = zustr2ustp(p, "world", 42);  // Padding null bytes ignored.
>> +p = zustr2ustp(p, "!", 1);
>> +len = p \- buf;
>> +printf("%.*s\en", (int) len, buf);
>> +.EE
>> +.\" ----- EXAMPLES :: zustr2stp(3) ------------------------------------/
>> +.TP
>> +.BR zustr2stp (3)
>> +.EX
>> +p = buf;
>> +p = zustr2stp(p, "Hello ", 6);
>> +p = zustr2stp(p, "world", 42);  // Padding null bytes ignored.
>> +p = zustr2stp(p, "!", 1);
>> +len = p \- buf;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
>> +.TP
>> +.BR strncat (3)
>> +.EX
>> +buf[0] = \(aq\e0\(aq;  // There's no 'cpy' function to this 'cat'.
>> +strncat(buf, "Hello ", 6);
>> +strncat(buf, "world", 42);  // Padding null bytes ignored.
>> +strncat(buf, "!", 1);
>> +len = strlen(buf);
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: ustpcpy(3) --------------------------------------/
>> +.TP
>> +.BR ustpcpy (3)
>> +.EX
>> +p = buf;
>> +p = ustpcpy(p, "Hello ", 6);
>> +p = ustpcpy(p, "world", 5);
>> +p = ustpcpy(p, "!", 1);
>> +len = p \- buf;
>> +printf("%.*s\en", (int) len, buf);
>> +.EE
>> +.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
>> +.TP
>> +.BR ustr2stp (3)
>> +.EX
>> +p = buf;
>> +p = ustr2stp(p, "Hello ", 6);
>> +p = ustr2stp(p, "world", 5);
>> +p = ustr2stp(p, "!", 1);
>> +len = p \- buf;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: Implementations :: ------------------------------/
>> +.SS Implementations
>> +Here are reference implementations for functions not provided by libc.
>> +.PP
>> +.in +4n
>> +.EX
>> +/* This code is in the public domain. */
>> +
>> +.\" ----- EXAMPLES :: Implementations :: stpecpy(3) -------------------/
>> +char *
>> +.IR stpecpy "(char *dst, char end[0], const char *restrict src)"
>> +{
>> +    char *p;
>> +
>> +    if (dst == end)
>> +        return end;
>> +
>> +    p = memccpy(dst, src, \(aq\e0\(aq, end \- dst);
>> +    if (p != NULL)
>> +        return p \- 1;
>> +
>> +    /* truncation detected */
>> +    end[\-1] = \(aq\e0\(aq;
>> +    return end;
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: stpecpyx(3) ------------------/
>> +char *
>> +.IR stpecpyx "(char *dst, char end[0], const char *restrict src)"
>> +{
>> +    if (src[strlen(src)] != \(aq\e0\(aq)
>> +        raise(SIGSEGV);
>> +
>> +    return stpecpy(dst, end, src);
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: zustr2ustp(3) ----------------/
>> +char *
>> +.IR zustr2ustp "(char *restrict dst, const char *restrict src, size_t sz)"
>> +{
>> +    return ustpcpy(dst, src, strnlen(src, sz));
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: zustr2stp(3) -----------------/
>> +char *
>> +.IR zustr2stp "(char *restrict dst, const char *restrict src, size_t sz)"
>> +{
>> +    char  *p;
>> +
>> +    p = zustr2ustp(dst, src, sz);
>> +    *p = \(aq\e0\(aq;
>> +
>> +    return p;
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: ustpcpy(3) -------------------/
>> +char *
>> +.IR ustpcpy "(char *restrict dst, const char *restrict src, size_t len)"
>> +{
>> +    return mempcpy(dst, src, len);
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: ustr2stp(3) ------------------/
>> +char *
>> +.IR ustr2stp "(char *restrict dst, const char *restrict src, size_t len)"
>> +{
>> +    char  *p;
>> +
>> +    p = ustpcpy(dst, src, len);
>> +    *p = \(aq\e0\(aq;
>> +
>> +    return p;
>> +}
>> +.EE
>> +.in
>> +.\" ----- SEE ALSO :: -------------------------------------------------/
>> +.SH SEE ALSO
>> +.BR bzero (3),
>> +.BR memcpy (3),
>> +.BR memccpy (3),
>> +.BR mempcpy (3),
>> +.BR stpcpy (3),
>> +.BR strlcpy (3bsd),
>> +.BR strncat (3),
>> +.BR stpncpy (3),
>> +.BR string (3)
>> --
>> 2.39.0
>>

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
