From: Alejandro Colomar <alx.manpages@gmail.com>
To: Martin Uecker <uecker@tugraz.at>, Joseph Myers <joseph@codesourcery.com>
Cc: Ingo Schwarze <schwarze@usta.de>,
JeanHeyd Meneide <wg14@soasis.org>,
linux-man@vger.kernel.org, gcc@gcc.gnu.org
Subject: Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
Date: Sun, 13 Nov 2022 14:19:24 +0100
Message-ID: <7931044a-b707-5a70-86c2-be298c35aa57@gmail.com>
In-Reply-To: <ceb7e51c8f01cb3c7069f3212a7e86e4b10e320d.camel@tugraz.at>
Hi Martin!
On 11/12/22 16:56, Martin Uecker wrote:
> Am Samstag, den 12.11.2022, 14:54 +0000 schrieb Joseph Myers:
>> On Sat, 12 Nov 2022, Alejandro Colomar via Gcc wrote:
>>
>>> Since it's to be used as an rvalue, not as a lvalue, I guess a
>>> postfix-expression wouldn't be the right one.
>>
>> Several forms of postfix-expression are only rvalues.
>>
>>>> (with a special rule about how the identifier is interpreted, different
>>>> from the normal scope rules)? If so, then ".a = 1" could either match
>>>> assignment-expression directly (assigning to the postfix-expression ".a").
>>>
>>> No, assigning to a function parameter from within another parameter
>>> declaration wouldn't make sense. They should be readonly. Side effects
>>> should be forbidden, I think.
>>
>> Such assignments are already allowed. In a function definition, the side
>> effects (including in size expressions for array parameters adjusted to
>> pointers) take place before entry to the function body.
>>
>> And, in any case, if you did have a constraint disallowing such
>> assignments, it wouldn't suffice for syntactic disambiguation (see the
>> previous point I made about that; I have some rough notes towards a WG14
>> paper on syntactic disambiguation, but haven't converted them into a
>> coherent paper).
>
> My idea was to only allow
>
> array-declarator : direct-declarator [ . identifier ]
>
> and only for parameter (not nested inside structs declared
> in parameter list) as a first step because it seems this
> would exclude all difficult cases.
>
> But if we need to allow more complicated expressions, then
> it starts getting more complicated.
Ahh, I guess my work in documenting the man-pages prototypes got me thinking of
those extensions to the idea. I don't remember all the details :)
>
> One could allow more generic expressions, and
> specify that the .identifier refers to a
> parameter in
> the nearest lexically enclosing parameter list or
> struct/union.
>
> Then
>
> void foo(struct bar { int x; char c[.x] } a, int x);
>
> would not be allowed (which is good because then we
> could later use the syntax also inside structs). If
> we apply scoping rules, the following would work:
>
> struct bar { int y; };
> void foo(char p[((struct bar){ .y = .x }).y], int x);
Makes sense.
>
> But not:
>
> struct bar { int y; };
> void foo(char p[((struct bar){ .y = .y }).y], int y);
Although it clearly is nonsense, I'm not sure I'd make it a constraint
violation, but rather Undefined Behavior. How is it different from this?
$ cat foo.c
int main(void)
{
	int i = i;
	return i;
}
$ gcc --version | head -n1
gcc (Debian 12.2.0-9) 12.2.0
$ gcc -Wall -Wextra -Werror foo.c
$
$ clang --version | head -n1
Debian clang version 14.0.6
$ clang -Wall -Wextra -Werror foo.c
foo.c:3:10: error: variable 'i' is uninitialized when used within its own
initialization [-Werror,-Wuninitialized]
        int i = i;
            ~   ^
1 error generated.
BTW, I just freaked out that GCC can't catch this trivial bug. Should I open a
bug report?
>
>
> But there are not only syntactical problems, because
> also the type of the parameter might become relevant
> and then you can get circular dependencies:
>
> void foo(char (*a)[sizeof *.b], char (*b)[sizeof *.a]);
This seems to be a serious stumbling block.
>
> I am not sure what would be the best way to fix it. One
> could specify that parameters referred to by
> the .identifier syntax must be of some integer type and
> that the sub-expression .identifier is always
> converted to a 'size_t'.
That makes sense, but then overnight some quite useful thing came to my mind
that would not be possible with this limitation:
<https://software.codidact.com/posts/285946>
char *
stpecpy(char dst[.end - .dst], char *src, char end[1])
{
	for (/* void */; dst <= end; dst++) {
		*dst = *src++;
		if (*dst == '\0')
			return dst;
	}
	/* Truncation detected */
	*end = '\0';

#if !defined(NDEBUG)
	/* Consume the rest of the input string. */
	while (*src++) {};
#endif

	return end + 1;
}
stpecpy() is a function similar to strlcat(3) that gets a pointer to the end of
the array instead of the size of the buffer. This allows chaining without
having performance issues[1].
[1]: <https://en.wikichip.org/wiki/schlemiel_the_painter%27s_algorithm>
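To illustrate the chaining, here is a minimal sketch in today's C. It assumes the same logic as stpecpy() above but drops the proposed [.end - .dst] bound, since no current compiler accepts it; it also adds const to src and leaves out the debug-only loop. join3() is a hypothetical helper made up for this example, and it assumes size >= 1.

```c
#include <stddef.h>

/* Same copy loop as stpecpy() above, written with plain pointers.
 * 'end' points to the last element of the destination array. */
char *
stpecpy(char *dst, const char *src, char *end)
{
	for (/* void */; dst <= end; dst++) {
		*dst = *src++;
		if (*dst == '\0')
			return dst;     /* points to the terminating NUL */
	}
	/* Truncation detected */
	*end = '\0';
	return end + 1;
}

/* Hypothetical helper: concatenate three strings into buf[size].
 * Returns 0 on success, -1 if the result was truncated. */
int
join3(char *buf, size_t size, const char *a, const char *b, const char *c)
{
	char *end = buf + size - 1;     /* last element, not one past it */
	char *p = buf;

	/* Each call resumes where the previous one stopped, so the
	 * destination is never rescanned from the start. */
	p = stpecpy(p, a, end);
	p = stpecpy(p, b, end);
	p = stpecpy(p, c, end);

	/* After a truncation every later call is a no-op that returns
	 * end + 1 again, so one check at the end is enough. */
	return (p == end + 1) ? -1 : 0;
}
```

With a 16-byte buffer, join3(buf, sizeof(buf), "Hello, ", "world", "!") leaves "Hello, world!" in buf; with an 8-byte buffer the result is cut to "Hello, " and the truncation is reported.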
Maybe allowing integral types and pointers would be enough. However, foreseeing
that the _Lengthof() proposal (BTW, which paper was it?) will succeed, and
combining it with this one, _Lengthof(pointer) would ideally give the length of
the array, so allowing pointers would conflict.
My solution is to disallow sizeof() and _Lengthof() on .identifier. That could
be done simply by saying that variably-modified types (VMT) are incomplete types
until immediately after the comma that follows the parameter declaration.
Therefore it would be allowed only in the same way as it is allowed right now
with the normal syntax (i.e., after the parameter has been seen).
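For comparison, here is a minimal sketch of what the normal syntax already allows, assuming a compiler with VLA support (GCC and Clang both have it). The name row_bytes() and its parameters are made up for illustration; the point is only that sizeof can be applied to a parameter once its declaration has been seen.

```c
#include <stddef.h>

/* 'n' is declared before 'a' uses it, and 'a' before 'b' uses it,
 * so every size expression only refers to complete, visible types. */
size_t
row_bytes(size_t n, char (*a)[n], char (*b)[sizeof *a])
{
	(void) a;
	return sizeof *b;       /* evaluated at run time; equals n */
}
```

Declaring b before a would be rejected, because a is not yet declared at the point where sizeof *a appears.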
BTW, what was the number of the latest paper for _Lengthof() and what happened
to it? I guess it's likely to be added to C3x, isn't it?
And another BTW: there's some kind of consistency in (some) projects for naming
sizes, and I have pending a review of the Linux man-pages to make it consistent
there too.
See the following table of usual conventions:

Operator/macro                | Variable names   | Description
------------------------------|------------------|---------------------------
strlen(3)                     | length, len, l   | String length.
sizeof()                      | size, sz, nbytes | Identifier size in bytes.
nitems(), nelems()            | n, nelem, nitems | Array number of elements.
sizeof_array(), array_bytes() | size, sz, nbytes | Array size in bytes.
Naming _Lengthof() the operator that gets the number of elements in an array
would create naming confusion, since then length can mean two different things.
I suggest _Nitemsof().
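A small sketch of the three quantities side by side may make the ambiguity more concrete. nitems() is defined locally here, and struct sizes, measure_string() and measure_ints() are hypothetical names used only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Number of elements in an array (must not be applied to a pointer). */
#define nitems(a)  (sizeof(a) / sizeof((a)[0]))

/* Hypothetical record, with fields named after the conventions above. */
struct sizes {
	size_t len;     /* strlen(3): string length, excluding the NUL */
	size_t size;    /* sizeof(): size in bytes */
	size_t n;       /* nitems(): number of elements */
};

struct sizes
measure_string(void)
{
	char s[16] = "abc";
	struct sizes r = { strlen(s), sizeof(s), nitems(s) };

	return r;       /* len = 3, size = 16, n = 16 */
}

struct sizes
measure_ints(void)
{
	int v[7] = { 0 };
	struct sizes r = { 0, sizeof(v), nitems(v) };

	return r;       /* size = 7 * sizeof(int), n = 7; len unused */
}
```

For the char array, "length" in the strlen(3) sense is 3 while the number of elements is 16, which is the clash that a name like _Lengthof() would invite.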
>
> Maybe one should also add a constraint that all new
> type length expressions, i.e. using the syntax,
> can not have side effects. Or even that they follow
> all the rules of integer constant expressions with
> the fictitious assumption that all .identifier
> sub-expressions are integer constant expressions.
> The rationale being that this would facilitate
> compile time reasoning about length expressions.
>
>
> Martin
>
Cheers,
Alex
--
<http://www.alejandro-colomar.es/>