From: "Alejandro Colomar (man-pages)" <alx.manpages@gmail.com>
To: Jonny Grant <jg@jguk.org>
Cc: linux-man <linux-man@vger.kernel.org>,
Michael Kerrisk <mtk.manpages@gmail.com>,
Florian Weimer <fw@deneb.enyo.de>,
gcc-help@gcc.gnu.org,
Segher Boessenkool <segher@kernel.crashing.org>,
Xi Ruoyao <xry111@mengyan1223.wang>
Subject: Re: strlen
Date: Fri, 9 Jul 2021 13:27:01 +0200 [thread overview]
Message-ID: <0ea47fb3-9acc-3517-9593-debcdbce2dd4@gmail.com> (raw)
In-Reply-To: <9d1681ef-1d97-2d08-9abe-dc63d817ca8a@jguk.org>
Hi Jonny,
On 7/9/21 12:50 PM, Jonny Grant wrote:
>
>
> On 08/07/2021 12:06, Alejandro Colomar (man-pages) wrote:
>> On 7/8/21 12:07 PM, Jonny Grant wrote:
>>> Thank you for your reply.
>>>
>>> We can't guarantee safestrlen() won't be called with NULL. So because strlen() itself doesn't check for NULL in C standard we'd need to call the wrapper so that NULL can be checked for.
>>>
>>> I'd like to avoid the compiler removing certain execution paths.
>>> I'd rather keep all code paths, even if they are not taken, just in case a NULL pointer creeps in due to an external device that is connected to an embedded system.
>>>
>>>
>>> Probably this would work:
>>>
>>> size_t __attribute__((optimize("O0"))) safestrlen(const char * s)
>>> {
>>> if (NULL == s) return 0;
>>> else return strlen(s);
>>> }
>>
>> I don't think you don't need that. Unless there's a bug in GCC, it shouldn't optimize that path unless it is 100% sure that it will never be called.
>
> That is good, so the code will always be kept! As compiler will never find all calls to strlen() and be sure those calls are never NULL.
Not always. If you inline that function, that path may be removed in
some calls, if the compiler knows better than you that it can. My point
is that you shouldn't care; your code is completely legal, and whatever
the compiler decides to do will also be legal (no undefined behavior,
and no crashes). If it optimizes, it will be a good thing that you
shouldn't prevent.
If the compiler does otherwise, that's a bug in the compiler, and
something you can't solve by writing different code or preventing
optimizations, or at least there's no guarantee about it, since it's a
bug. But I don't believe you'll find a bug in this case. So please,
trust the compiler, at least when using perfectly defined behavior. And
if you don't trust the compiler, which is perfectly reasonable, test it,
but don't try to workaround bugs that don't exist.
>
>> Moreover, I recommend you to optimize as much as possible.
>> Even though NULL is possible in your code, I guess it's unlikely.
>>
>> Also, calling a function safe is too generic.
>> I'd call it with the suffix null, as it act different on null.
>>
>> Also, I recommend avoiding 'size_t' (and any other unsigned types, BTW).
>> See <https://google.github.io/styleguide/cppguide.html#Integer_Types>.
>> Use the POSIX type 'ssize_t'.
>> That also allows differentiating a length of 0 (i.e., "") from an invalid string (i.e., NULL), by returning -1 for NULL.
>>
>
> https://man7.org/linux/man-pages/man3/strlen.3.html
> size_t strlen(const char *s);
>
> I'd rather not change the return type from POSIX size_t in any wrapper of strlen. Unless it is part of C11 Annex K style standards improvement.
That's a historical accident.
A long time ago (much before I was born, and much before the first
standard, I mean in the early times of K&R C and Unix), unsigned types
were used more than they should, and the first C standards (I mean
ANSI/ISO standards (i.e., C89, C99, ...), not POSIX), with a lot of
already existing code, didn't attempt to change the language, but to
annotate common usage.
In POSIX.1 there's a mix, because POSIX has the type 'ssize_t', which is
not defined by the C standards. POSIX in general tends to use the
signed type 'ssize_t' for its POSIX-only functions (i.e., not in the C
standards).
Annex K has been an attempt of Microsoft to provide safer functions, but
while there are some functions there that have good intentions, most of
them are just badly designed. That annex K is DOA, and will probably be
marked as deprecated in C22 (currently C2x).
I think that a standard should not try to design new functions, and
instead just annotate common usage, as they did in the first ones.
Problems like the ones Annex K suffers could have been detected early if
they had been implemented as an extension to some compiler(s) decade(s)
before being standardized. Therefore, if the implementation passes the
test of time, you standardize it, else not, IMO. Otherwise, we have a
standard that is declared deprecated in the next version of the
standard, similar to what is happening with the C++ standards (which,
guess what, BTW I recently read that they are undeprecating a lot of C
stuff they deprecated in the first standards).
Using unsigned for anything else than bitfileds and similar stuff is
just *wrong*, as you can read from the Google C++ style guide I linked
before.
Another source you can read is this paper from Bjarne Stroustrup:
<http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p1428r0.pdf>
This is one of the few cases where I agree with something coming from
him. I hope some day we get a ssizeof operator :-)
Cheers,
Alex
--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/
next prev parent reply other threads:[~2021-07-09 11:27 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-04 18:01 strlen Jonny Grant
2020-09-04 19:21 ` strlen Florian Weimer
2020-09-04 23:14 ` strlen Jonny Grant
2020-09-05 7:12 ` strlen Florian Weimer
2021-07-06 20:30 ` strlen Jonny Grant
2021-07-06 22:11 ` strlen Florian Weimer
2021-07-07 11:36 ` strlen Jonny Grant
2021-07-07 12:22 ` strlen Alejandro Colomar (man-pages)
2021-07-07 12:31 ` strlen Alejandro Colomar (man-pages)
2021-07-07 13:31 ` strlen Jonny Grant
2021-07-07 16:57 ` strlen Alejandro Colomar (man-pages)
2021-07-07 17:23 ` strlen Alejandro Colomar (man-pages)
2021-07-07 17:33 ` strlen Alejandro Colomar (man-pages)
2021-07-09 13:48 ` strlen Jonny Grant
2021-07-08 10:07 ` strlen Jonny Grant
2021-07-08 11:06 ` strlen Alejandro Colomar (man-pages)
2021-07-08 12:13 ` strlen Xi Ruoyao
2021-07-08 23:49 ` strlen Segher Boessenkool
2021-07-09 13:54 ` strlen Jonny Grant
2021-07-09 14:17 ` strlen Alejandro Colomar (man-pages)
2021-07-09 16:11 ` strlen Xi Ruoyao
2021-07-10 1:00 ` strlen Segher Boessenkool
2021-07-09 10:50 ` strlen Jonny Grant
2021-07-09 11:27 ` Alejandro Colomar (man-pages) [this message]
2021-07-09 11:43 ` strlen Alejandro Colomar (man-pages)
[not found] ` <1627912755.3783669.1625745946723@mail.yahoo.com>
[not found] ` <59a70222-a46f-1e65-c9db-6c9e577c8adc@126.com>
2021-07-09 17:26 ` strlen Martin Sebor
2021-07-09 20:19 ` strlen Alejandro Colomar (man-pages)
2021-07-09 20:44 ` strlen Jonny Grant
2021-07-10 18:37 ` strlen Alejandro Colomar (man-pages)
2021-07-10 20:49 ` strlen Jonny Grant
2021-07-10 21:36 ` strlen Alejandro Colomar (man-pages)
2021-07-12 21:16 ` strlen Jonny Grant
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0ea47fb3-9acc-3517-9593-debcdbce2dd4@gmail.com \
--to=alx.manpages@gmail.com \
--cc=fw@deneb.enyo.de \
--cc=gcc-help@gcc.gnu.org \
--cc=jg@jguk.org \
--cc=linux-man@vger.kernel.org \
--cc=mtk.manpages@gmail.com \
--cc=segher@kernel.crashing.org \
--cc=xry111@mengyan1223.wang \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).