* A few proposals from the C standards committee
@ 2024-01-23 16:46 Paul E. McKenney
2024-01-23 18:58 ` Linus Torvalds
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Paul E. McKenney @ 2024-01-23 16:46 UTC (permalink / raw)
To: linux-toolchains; +Cc: peterz, hpa, rostedt, gregkh, keescook, torvalds
Hello!
On the perhaps unlikely off-chance that any of this is of interest.
Thanx, Paul
------------------------------------------------------------------------
List of proposals with clickable links:
https://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log
N3089 _Optional: a type qualifier to indicate pointer nullability
Proposes _Optional to tag pointer parameters such that
dereferencing the pointer without first checking for NULL gets
a compiler warning.
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3089.pdf
N3190 Extensions to the preprocessor for C2Y
Proposes a number of macros, including things that return a
count of their arguments.
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3190.htm
N3194 Case range expressions
No fewer than 421 files in the Linux kernel use the "..." syntax,
as in "case 1 ... 3", but there are other syntaxes... So they
are proposing "::" instead. My guess is that "..." won't be
going away anytime soon.
N3195 Named loops
Placing a goto label before a loop allows a break/continue to
target that loop in case of nesting.
N3203 Strict order of expression evaluation
I do like it. The 1980s were over a long time ago.
N3199 Improved __attribute__((cleanup)) Through defer
N3198 Conditionally Supported Unwinding
The Linux kernel is starting to use __attribute__((cleanup))
via guard(), with 40 files making use of this. It is not clear
to me whether or not either of these proposals would be useful
to the Linux kernel.
N3201 Operator Overloading Without Name Mangling v2
I have seen Linux-kernel interest in *function* overloading, but
not in operator overloading. Nevertheless...
The trick here is to associate a given operator with a function,
so that the name-mangling becomes essentially a manual operation.
* Re: A few proposals from the C standards committee
From: Linus Torvalds @ 2024-01-23 18:58 UTC
To: paulmck; +Cc: linux-toolchains, peterz, hpa, rostedt, gregkh, keescook

I generally like them, but..

On Tue, 23 Jan 2024 at 08:46, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> N3089 _Optional: a type qualifier to indicate pointer nullability
>         Proposes _Optional to tag pointer parameters such that
>         dereferencing the pointer without first checking for NULL gets
>         a compiler warning.
>         https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3089.pdf

This one I also like, but at the same time I'm not convinced "types"
are the right way to carry this information.

Because types are historically conceptually static and tied to the
lifetime of the object.

But the actual nullability logic must *not* be.

_Nonnull is fine: if a variable is non-null, it can conceptually never
become anything else (or rather: it would remain a bug if it did).

So _Nonnull is a "statement of fact" about the variable, and makes
sense as a type, and matches the lifetime of the variable.

But the same is *not* true of _Nullable. The type magically and
silently changes after a test.

To make a trivial stupid example of what I mean, something like

    inline int access(int * _Nonnull p) { return *p; }
    ...
    int my_fn(int * _Nullable p)
    { return p ? access(p) : 0; }

which is obviously correct, and shouldn't warn for anything, since
this is literally what the whole thing is designed for.

But part of that "shouldn't warn" is how a nullable 'p' is effectively
silently cast to a non-nullable 'p'. The only thing that makes that
cast valid is the presence of the conditional, but it should be noted
that from a *type* perspective that is just wrong.

IOW, normal types are carried along with their variables, but somehow
the variable 'p' inside the conditional is not really of the same
type as 'p' outside of it.

So that conditional has that hidden effect of changing what the type
of 'p' is in all dependent expressions.

And I know compilers already effectively implement all this, but I'm
just saying that from a *type* system standpoint, this is all quite a
bit illogical.

In many ways, this is not a type issue, it's really a "value range
analysis" issue. And I think it should be considered that way, and
the syntax and the logic be also talked about in those terms.

Why would "_Nullable" and "_Nonnull" be conceptually any different
from "I know this value is in the range [0..5]", which is *also*
something that compilers already do, and that we also might want to be
able to describe for warning purposes?

So honestly, I would *love* to be able to give the compiler range
information (which *includes* the "this is nullable" kind of
information), but I don't think it should be described as a "type
qualifier".

Because what if the nullability is hidden in some called function?
To give another example - less stupid this time - think of
something like this:

    int my_fn(int * _Nullable p)
    {
        if (check_validity(p))
            return -EINVAL;
        return access(p);
    }

where we have perhaps done extensive validity checks on 'p' (think the
kernel kind of 'access_ok()' function) in the 'check_validity()'
function, but the compiler doesn't see that function, since it's a
rather complicated one that does a whole RB-tree lookup etc. So the
compiler hasn't *seen* that we do a NULL check there.

So it shouldn't warn, but it will - because the compiler is oblivious
to the fact that the pointer has actually been checked for a lot
more than just NULL.

If you think of this as a "value analysis" issue, rather than as a
type issue, the solution is obvious: it's not that the type of 'p'
changes, but you just want a way to tell the compiler "I've done range
checking, the new range is XYZ".

And if you think of it that way, you don't want to re-declare a type,
you want to just update range information, and simply state something
like

    _Nonnull p;

after doing the check_validity() call. IOW, I really think you should
be able to write something like

    int my_fn(int * _Nullable p)
    {
        if (check_validity(p))
            return -EINVAL;
        _Nonnull p;
        return access(p);
    }

See? My argument is basically that I like the _Nullable/_Nonnull
attributes, but that they shouldn't be seen as part of the *type*
system, but as a more dynamic value range thing, and that they can -
and should - be available separately from just the declaration.

            Linus
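For readers who want to compile the first example above, here is a minimal sketch. It rests on a few assumptions that are not from the thread: NULLABLE and NONNULL are hypothetical wrapper macros that expand to clang's _Nullable/_Nonnull under clang and to nothing elsewhere, and deref() stands in for access(), renamed only to avoid clashing with the POSIX access() declaration.

```c
#include <stddef.h>

/*
 * Hypothetical wrappers (not from the thread): clang understands
 * _Nullable/_Nonnull, other compilers may not, so expand to nothing there.
 */
#if defined(__clang__)
#define NULLABLE _Nullable
#define NONNULL  _Nonnull
#else
#define NULLABLE
#define NONNULL
#endif

/* The callee dereferences unconditionally, so it asks for a non-NULL pointer. */
static inline int deref(int *NONNULL p)
{
	return *p;
}

/*
 * The caller accepts NULL. Only the conditional makes it valid to hand 'p'
 * to deref(): inside the true branch, 'p' is known to be non-NULL even
 * though its declared type never changed.
 */
int my_fn(int *NULLABLE p)
{
	return p ? deref(p) : 0;
}
```

Calling my_fn() with NULL takes the else branch and never reaches deref(), which is the narrowing-by-test behavior the message describes as a value-range property rather than a type property.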
* Re: A few proposals from the C standards committee
From: Paul E. McKenney @ 2024-01-23 20:00 UTC
To: Linus Torvalds; +Cc: linux-toolchains, peterz, hpa, rostedt, gregkh, keescook

On Tue, Jan 23, 2024 at 10:58:04AM -0800, Linus Torvalds wrote:

[ . . . ]

> IOW, normal types are carried along with their variables, but somehow
> the variable 'p' inside the conditional is not really of the same
> type as 'p' outside of it.
>
> So that conditional has that hidden effect of changing what the type
> of 'p' is in all dependent expressions.
>
> And I know compilers already effectively implement all this, but I'm
> just saying that from a *type* system standpoint, this is all quite a
> bit illogical.

Would you be OK with something that required a new variable for the
pointer that was now known not to be NULL? (My guess is "no", given the
following discussion on value ranges, but I figured that I should ask.)

[ . . . ]

> And if you think of it that way, you don't want to re-declare a type,
> you want to just update range information, and simply state something
> like
>
>     _Nonnull p;
>
> after doing the check_validity() call. IOW, I really think you should
> be able to write something like
>
>     int my_fn(int * _Nullable p)
>     {
>         if (check_validity(p))
>             return -EINVAL;
>         _Nonnull p;
>         return access(p);
>     }
>
> See? My argument is basically that I like the _Nullable/_Nonnull
> attributes, but that they shouldn't be seen as part of the *type*
> system, but as a more dynamic value range thing, and that they can -
> and should - be available separately from just the declaration.

In some implementations, you can use assertions to get at least part
of this effect:

    int my_fn(int * _Nullable p)
    {
        if (check_validity(p))
            return -EINVAL;
        assert(p);
        return access(p);
    }

And for your "[0..5]" example, assert(i >= 0 && i <= 5):

https://godbolt.org/z/xrdx1P3a8

In the kernel, we would of course need to have a way to tell the
compiler about our assertions.

The downside is that assert() will actually check the condition and
emit code to report the failure if that condition is not met. So you
are looking for something like assert(), but which simply informs the
compiler rather than doing the checking and calling?

If so, then in clang and GCC there is __builtin_unreachable():

https://godbolt.org/z/9qrbGx848 (clang, works in clang 9 but not 8)
https://godbolt.org/z/Kd44eTTWz (gcc, works also in gcc 8.1)

Is something like that what you had in mind?

            Thanx, Paul
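The two approaches in this message can be put side by side. The following is a minimal sketch for GCC or clang only; the weights table and both function names are invented for illustration and are not taken from the godbolt links.

```c
#include <assert.h>

/* Illustrative six-entry table, indexed by a value in the range [0..5]. */
static const int weights[6] = { 1, 2, 4, 8, 16, 32 };

/*
 * Checked version: assert() evaluates the condition at run time and aborts
 * on failure, unless NDEBUG is defined, in which case it expands to nothing
 * and the compiler learns nothing about the range either.
 */
int weight_checked(int i)
{
	assert(i >= 0 && i <= 5);
	return weights[i];
}

/*
 * Promise-only version: the compiler is told that the out-of-range path
 * cannot happen. An optimizing compiler typically emits no check, and an
 * out-of-range 'i' is undefined behavior.
 */
int weight_promised(int i)
{
	if (i < 0 || i > 5)
		__builtin_unreachable();
	return weights[i];
}
```

Both functions return the same value for in-range arguments; they differ only in what happens, and what code is generated, when the promise is broken.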
* Re: A few proposals from the C standards committee
From: Linus Torvalds @ 2024-01-23 20:20 UTC
To: paulmck; +Cc: linux-toolchains, peterz, hpa, rostedt, gregkh, keescook

On Tue, 23 Jan 2024 at 12:00, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> Would you be OK with something that required a new variable for the
> pointer that was now known not to be NULL? (My guess is "no", given the
> following discussion on value ranges, but I figured that I should ask.)

Yeah, no, I think that ends up putting the burden on the programmer in
the form of a very cumbersome syntax, and just more room for mistakes.

> In some implementations, you can use assertions to get at least part
> of this effect:

Yes. However, the problem with that is that the assert generally then
comes with extra code generation.

IOW, a plain

    _Nonnull p;

in my opinion should imply a promise by the developer - and then you
could have some "debug build" model where the compiler then verifies
the promises.

But an

    assert(p);

implies more than a promise by the developer - it implies that the
compiler *should* generate some code to verify.

And yes, obviously assert() comes with the traditional NDEBUG flag,
but that one has the historical baggage of causing the assert() to be
a no-op. IOW, you lose the code generation, but you also lose the
promise from the developer.

Could all of this be done *properly*? Yes. And I think it should. But
properly literally means having good documented "this is what this
means".

And no, __builtin_unreachable() is not it either, because it again has
the same issue as "assert()" - in *practice* compilers can use it as a
hint, but that's an incidental result, not part of a documented "this
is how you specify a known range"

So yes, I can do things like

    if (a < 0) __builtin_unreachable();

and it will generate the *code* that I want, but it sure as hell isn't
some standard C syntax.

            Linus
* Re: A few proposals from the C standards committee
From: Jakub Jelinek @ 2024-01-23 20:35 UTC
To: Linus Torvalds
Cc: paulmck, linux-toolchains, peterz, hpa, rostedt, gregkh, keescook

On Tue, Jan 23, 2024 at 12:20:14PM -0800, Linus Torvalds wrote:
> And no, __builtin_unreachable() is not it either, because it again has
> the same issue as "assert()" - in *practice* compilers can use it as a
> hint, but that's an incidental result, not part of a documented "this
> is how you specify a known range"
>
> So yes, I can do things like
>
>     if (a < 0) __builtin_unreachable();
>
> and it will generate the *code* that I want, but it sure as hell isn't
> some standard C syntax.

C++23 has [[assume (condition)]]; for this (see https://wg21.link/p1774r8)
and GCC supports it also as [[gnu::assume (condition)]] and
__attribute__((assume (condition)));, both in C (the former only in C23)
and C++. Side-effects in condition aren't evaluated, so it has
different behavior from if (!(condition)) __builtin_unreachable ();

            Jakub
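The spellings mentioned so far can be hidden behind one macro. This is a minimal sketch under stated assumptions: ASSUME() and scale() are hypothetical names, the GCC version cutoff of 13 is assumed to be the first release with the assume attribute, and the older-GCC fallback evaluates its condition, so it is not equivalent when the condition has side effects.

```c
/*
 * Hypothetical ASSUME() wrapper (name invented for this sketch).
 * Each branch uses the spelling that compiler is known to accept.
 */
#if defined(__clang__)
#define ASSUME(cond) __builtin_assume(cond)
#elif defined(__GNUC__) && __GNUC__ >= 13
/* Statement attribute: the condition is not evaluated. */
#define ASSUME(cond) __attribute__((assume(cond)))
#elif defined(__GNUC__)
/* Older GCC fallback: the condition IS evaluated here. */
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#elif defined(_MSC_VER)
#define ASSUME(cond) __assume(cond)
#else
#define ASSUME(cond) ((void)0)
#endif

/*
 * Promise that 'shift' is in [0..5]; the optimizer may use this range, and
 * breaking the promise is undefined behavior.
 */
unsigned int scale(unsigned int value, int shift)
{
	ASSUME(shift >= 0 && shift <= 5);
	return value << shift;
}
```

Keeping the condition free of side effects, as above, sidesteps the behavioral difference between the attribute form and the if/__builtin_unreachable() form.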
* Re: A few proposals from the C standards committee
From: Linus Torvalds @ 2024-01-23 20:43 UTC
To: Jakub Jelinek
Cc: paulmck, linux-toolchains, peterz, hpa, rostedt, gregkh, keescook

On Tue, 23 Jan 2024 at 12:36, Jakub Jelinek <jakub@redhat.com> wrote:
>
> C++23 has [[assume (condition)]]; for this (see https://wg21.link/p1774r8)
> and GCC supports it also as [[gnu::assume (condition)]] and
> __attribute__((assume (condition)));, both in C (the former only in C23)
> and C++. Side-effects in condition aren't evaluated, so it has
> different behavior from if (!(condition)) __builtin_unreachable ();

That's lovely, and exactly the kind of thing I'd think is the right model.

If you can also do it in a function declaration, so that it informs
the caller, it's basically perfect.

IOW, something like

    size_t strlen(const char *s [[assume(s)]]);

would be the equivalent of "const char *_Nonnull s" in that callers
could warn if not true.

Except it also would work for other things, not just NULL pointers.

            Linus
* Re: A few proposals from the C standards committee
From: H. Peter Anvin @ 2024-01-23 20:46 UTC
To: Linus Torvalds, Jakub Jelinek
Cc: paulmck, linux-toolchains, peterz, rostedt, gregkh, keescook

On 1/23/24 12:43, Linus Torvalds wrote:

[ . . . ]

> That's lovely, and exactly the kind of thing I'd think is the right model.
>
> If you can also do it in a function declaration, so that it informs
> the caller, it's basically perfect.
>
> IOW, something like
>
>     size_t strlen(const char *s [[assume(s)]]);
>
> would be the equivalent of "const char *_Nonnull s" in that callers
> could warn if not true.
>
> Except it also would work for other things, not just NULL pointers.
>

This would *definitely* be frakking nice.

            -hpa
* Re: A few proposals from the C standards committee
From: Paul E. McKenney @ 2024-01-24 13:46 UTC
To: H. Peter Anvin
Cc: Linus Torvalds, Jakub Jelinek, linux-toolchains, peterz, rostedt, gregkh, keescook

On Tue, Jan 23, 2024 at 12:46:31PM -0800, H. Peter Anvin wrote:
> On 1/23/24 12:43, Linus Torvalds wrote:

[ . . . ]

> > If you can also do it in a function declaration, so that it informs
> > the caller, it's basically perfect.
> >
> > IOW, something like
> >
> >     size_t strlen(const char *s [[assume(s)]]);
> >
> > would be the equivalent of "const char *_Nonnull s" in that callers
> > could warn if not true.
> >
> > Except it also would work for other things, not just NULL pointers.
>
> This would *definitely* be frakking nice.

It would be!!!

Sadly, if I am reading Section 4.9 correctly, https://wg21.link/p1774r8
proposes permitting [[assume()]] only as a statement, which would rule
out appending it to a formal-parameter declaration. Their reason is
lack of existing practice, for example, you cannot place
__builtin_assume() in a formal parameter list.

Compare this: https://godbolt.org/z/hjrzfsxjv

To this: https://godbolt.org/z/8zdfzsjxe

So if we want this to be added to the standard, we need to convince the
compilers to allow it, then we could propose it.

But couldn't we get the same behavior using static inline functions?

    size_t strlen_func(const char *s);
    static inline size_t strlen(const char *s)
    {
        [[assume(s)]];
        return strlen_func(s);
    }

The current intrinsics do seem to support this approach:

https://godbolt.org/z/fEjcaorPP (GCC)
https://godbolt.org/z/We8vv47v3 (clang)

Thoughts?

            Thanx, Paul
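The static-inline wrapper idea can be tried without colliding with the C library's strlen(). This is a minimal sketch with invented names (my_strlen, my_strlen_func), and it uses __builtin_unreachable() as a stand-in for [[assume(s)]] so that it builds with older GCC and clang; the out-of-line function is defined in the same file only to keep the example self-contained.

```c
#include <stddef.h>

/*
 * Out-of-line implementation. In real use this would live in a separate
 * translation unit, invisible to the caller's optimizer.
 */
size_t my_strlen_func(const char *s)
{
	size_t n = 0;

	while (s[n])
		n++;
	return n;
}

/*
 * Inline wrapper: once inlined into a caller, the non-NULL promise becomes
 * part of that caller's value-range information for 's'.
 */
static inline size_t my_strlen(const char *s)
{
	if (!s)
		__builtin_unreachable();	/* stand-in for [[assume(s)]] */
	return my_strlen_func(s);
}
```

Note that this only gives the optimizer information in each caller; whether a compiler also turns it into a caller-side warning is a separate question that the sketch does not settle.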
* Re: A few proposals from the C standards committee
From: Paul E. McKenney @ 2024-01-25 13:00 UTC
To: Linus Torvalds
Cc: Jakub Jelinek, linux-toolchains, peterz, hpa, rostedt, gregkh, keescook

On Tue, Jan 23, 2024 at 12:43:02PM -0800, Linus Torvalds wrote:

[ . . . ]

> If you can also do it in a function declaration, so that it informs
> the caller, it's basically perfect.
>
> IOW, something like
>
>     size_t strlen(const char *s [[assume(s)]]);
>
> would be the equivalent of "const char *_Nonnull s" in that callers
> could warn if not true.
>
> Except it also would work for other things, not just NULL pointers.

None of the current compilers support this, but it should not be hard
to mechanically transform this to the form using static inlines,
presumably with a made-up name for one level or the other </handwaving>.
(Especially easy for all concerned if someone other than me does it,
of course...)

However, the possibility of pointers to these functions means that I
must ask if this assume() is part of the type. There are a lot of
reasons to *not* want it to be part of the type, but that would mean
that calls through pointers would ignore that assume().

Thoughts?

            Thanx, Paul
* Re: A few proposals from the C standards committee
From: Paul E. McKenney @ 2024-01-24 13:16 UTC
To: Jakub Jelinek
Cc: Linus Torvalds, linux-toolchains, peterz, hpa, rostedt, gregkh, keescook

On Tue, Jan 23, 2024 at 09:35:46PM +0100, Jakub Jelinek wrote:

[ . . . ]

> C++23 has [[assume (condition)]]; for this (see https://wg21.link/p1774r8)
> and GCC supports it also as [[gnu::assume (condition)]] and
> __attribute__((assume (condition)));, both in C (the former only in C23)
> and C++. Side-effects in condition aren't evaluated, so it has
> different behavior from if (!(condition)) __builtin_unreachable ();

The lack of side effects is quite nice, thank you for the pointer!

This P1774R8 proposal explicitly calls out the possibility of assumptions
propagating backwards, so there might also need to be a C-language
counterpart to std::observable() [1] to block such propagation.

In addition, if the [[assume()]] expression contains undefined behavior,
that is explicitly allowed to "leak" out of that expression. So something
like [[assume(*p)]] implies [[assume(p && *p)]] and something like
[[assume(i / j)]] implies [[assume(j && i / j)]]. If this caused a
problem, one alternative is to instead write [[assume(!p || *p)]] on the
one hand or [[assume(!j || i / j)]] on the other.

Thoughts?

            Thanx, Paul

[1] https://wg21.link/p1494r2
* Re: A few proposals from the C standards committee
From: H. Peter Anvin @ 2024-01-23 20:44 UTC
To: Linus Torvalds, paulmck
Cc: linux-toolchains, peterz, rostedt, gregkh, keescook

On 1/23/24 12:20, Linus Torvalds wrote:

[ . . . ]

> And no, __builtin_unreachable() is not it either, because it again has
> the same issue as "assert()" - in *practice* compilers can use it as a
> hint, but that's an incidental result, not part of a documented "this
> is how you specify a known range"
>
> So yes, I can do things like
>
>     if (a < 0) __builtin_unreachable();
>
> and it will generate the *code* that I want, but it sure as hell isn't
> some standard C syntax.
>

C++23 adds [[assume(x)]]; which presumably will be "backported" to the C
standard.

This is basically MSVC's __assume() or gcc's __attribute__((assume()))
which is otherwise exactly equivalent to using an if statement or a
conditional expression to invoke unreachable(); the latter has the
advantage that if you use a macro you can replace unreachable() with
something else for debugging purposes.

unreachable() is in C23, in <stddef.h>, so

    if (a < 0) unreachable();

or

    ((a < 0) ? unreachable() : (void)0)

actually *is* standard C syntax now...

            -hpa
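The macro-swapping point can be sketched as follows. PROMISE(), DEBUG_PROMISES, and halve_nonnegative() are hypothetical names invented here; the fallback definition of unreachable() assumes GCC or clang when the toolchain does not yet provide the C23 macro in <stddef.h>.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* C23 provides unreachable() in <stddef.h>; otherwise use the builtin. */
#ifndef unreachable
#define unreachable() __builtin_unreachable()
#endif

#ifdef DEBUG_PROMISES
/* Debug builds: verify the promise and abort loudly if it is broken. */
#define PROMISE(cond)							\
	do {								\
		if (!(cond)) {						\
			fprintf(stderr, "broken promise: %s\n", #cond);	\
			abort();					\
		}							\
	} while (0)
#else
/* Production builds: no report, just information for the optimizer. */
#define PROMISE(cond) do { if (!(cond)) unreachable(); } while (0)
#endif

/* The caller promises a non-negative argument. */
int halve_nonnegative(int a)
{
	PROMISE(a >= 0);
	return a / 2;
}
```

Building with -DDEBUG_PROMISES turns every promise into a run-time check, which is roughly the "debug build" model raised earlier in the thread, while the default build keeps only the optimizer hint.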
* Re: A few proposals from the C standards committee
From: Paul E. McKenney @ 2024-01-24 12:52 UTC
To: Linus Torvalds; +Cc: linux-toolchains, peterz, hpa, rostedt, gregkh, keescook

On Tue, Jan 23, 2024 at 12:20:14PM -0800, Linus Torvalds wrote:

[ . . . ]

> Could all of this be done *properly*? Yes. And I think it should. But
> properly literally means having good documented "this is what this
> means".
>
> And no, __builtin_unreachable() is not it either, because it again has
> the same issue as "assert()" - in *practice* compilers can use it as a
> hint, but that's an incidental result, not part of a documented "this
> is how you specify a known range"
>
> So yes, I can do things like
>
>     if (a < 0) __builtin_unreachable();
>
> and it will generate the *code* that I want, but it sure as hell isn't
> some standard C syntax.

Agreed, but the fact that it exists is nevertheless valuable. The reason
is that it is *way* easier to get something into the C standard if the
major implementations already do something supporting it. Given that
GCC and clang do __builtin_unreachable() and the Microsoft compiler has
__assume() [1], there is hope. Plus as Jakub noted, C++23 has
[[assume()]], which provides even more hope, especially given that C and
C++ put some work into maintaining basic compatibility.

That said, the compilers are likely to continue taking value ranges as a
suggestion for optimization, especially at low optimization levels.
Nevertheless, over time optimizers will continue to become more capable
for both good and ill. :-/

            Thanx, Paul

[1] https://learn.microsoft.com/en-us/cpp/intrinsics/assume?view=msvc-170
* Re: A few proposals from the C standards committee
From: Linus Torvalds @ 2024-01-23 20:39 UTC
To: paulmck; +Cc: linux-toolchains, peterz, hpa, rostedt, gregkh, keescook

On Tue, 23 Jan 2024 at 12:00, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> In some implementations, you can use assertions to get at least part
> of this effect:

Another note: 'assert()' doesn't work in the calling context.

IOW, there is no way to 'assert()' that an incoming variable has a
certain range, unless you start doing strange inline wrapper
functions.

So you can say

    assert(i >= 0 && i <= 5);

to assert a range inside some function, but you can't do that in the
declaration of the function to get a warning if the callers do
something bad.

And that's literally half the whole point of _Nullable and _Nonnull.
You can give the value range description in the declaration of the
function.

The paper actually gives an example of a more powerful syntax, which
is admittedly not pretty, ie that whole

    const char src[static 1]

that says "the argument is a pointer to an array of at least one
character". Yes, the syntax is horrendous, and only works for pointers
which is sad, but it's also an example of a fundamentally more
powerful syntax.

Wouldn't it be lovely to be able to just specify a valid range for
integers too? Other languages have had it.

You can actually get some of that by using enums, while again, that's
not *documented*, and it's more of a "in practice you can use an enum
and a compiler might assume the values are all valid".

            Linus
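The array-parameter syntax mentioned above is standard C since C99 (it is not valid C++). A minimal sketch, with count_spaces() as an invented example function:

```c
#include <stddef.h>

/*
 * 'static 1' in the parameter declarator tells the compiler that 'src'
 * points to the first element of an array with at least one element, so
 * in particular it is not NULL. Because the promise is in the declaration,
 * it is visible to every caller; some compilers can diagnose an obviously
 * NULL argument at the call site.
 */
size_t count_spaces(const char src[static 1])
{
	size_t n = 0;

	for (const char *p = src; *p; p++)
		if (*p == ' ')
			n++;
	return n;
}
```

This is the declaration-side counterpart to an in-body assert(): the callee states its requirement once, in the prototype, rather than checking it after the call has already happened.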
* Re: A few proposals from the C standards committee 2024-01-23 18:58 ` Linus Torvalds 2024-01-23 20:00 ` Paul E. McKenney @ 2024-01-23 22:35 ` Martin Uecker 1 sibling, 0 replies; 19+ messages in thread From: Martin Uecker @ 2024-01-23 22:35 UTC (permalink / raw) To: Linus Torvalds, paulmck Cc: linux-toolchains, peterz, hpa, rostedt, gregkh, keescook Am Dienstag, dem 23.01.2024 um 10:58 -0800 schrieb Linus Torvalds: > I generally like them, but.. > > On Tue, 23 Jan 2024 at 08:46, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > N3089 _Optional: a type qualifier to indicate pointer nullability > > Proposes _Optional to tag pointer parameters such that > > dereferencing the pointer without first checking for NULL gets > > a compiler warning. > > https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3089.pdf > > This one I also like, but at the same time I'm not convinced "types" > are the right way to carry this information. > > Because types are historically conceptually static and tied to the > lifetime of the object. > > But the actual nullability logic must *not* be. > > _Nonnull is fine: if a variable is non-null, it can conceptually never > become anything else (or rather: it would remain a bug if it did). > > So _Nonnull is a "statement of fact" about the variable, and makes > sense as a type, and matches the lifetime of the variable. > > But the same is *not* true of _Nullable. The type magically and > silently changes after a test. The interesting thing about _Optional is that it is not qualifier on the pointer but a qualifier on the target. So conversion from _Optional to a regular pointer would give you the diagnostic via the usual rules for pointer conversion. The paper then suggests &*ptr as syntax to transform the pointer with qualifier into a pointer without qualifier (after a check). Not sure the design is perfect, but it seems better than _Nonnull and _Nullable. 
Martin

>
> To make a trivial stupid example of what I mean, something like
>
>     inline int access(int * _Nonnull p) { return *p; }
>  ...
>     int my_fn(int * _Nullable p)
>     { return p ? access(p) : 0; }
>
> which is obviously correct, and shouldn't warn for anything, since
> this is literally what the whole thing is designed for.
>
> But part of that "shouldn't warn" is how a nullable 'p' is effectively
> silently cast to a non-nullable 'p'. The only thing that makes that
> cast valid is the presence of the conditional, but it should be noted
> that from a *type* perspective that is just wrong.
>
> IOW, normal types are carried along with their variables, but somehow
> the variable 'p' inside the conditional is not really of the same
> type as 'p' outside of it.
>
> So that conditional has that hidden effect of changing what the type
> of 'p' is in all dependent expressions.
>
> And I know compilers already effectively implement all this, but I'm
> just saying that from a *type* system standpoint, this is all quite a
> bit illogical.
>
> In many ways, this is not a type issue, it's really a "value range
> analysis" issue. And I think it should be considered that way, and
> the syntax and the logic be also talked about in those terms.
>
> Why would "_Nullable" and "_Nonnull" be conceptually any different
> from "I know this value is in the range [0..5]", which is *also*
> something that compilers already do, and that we also might want to be
> able to describe for warning purposes?
>
> So honestly, I would *love* to be able to give the compiler range
> information (which *includes* the "this is nullable" kind of
> information), but I don't think it should be described as a "type
> qualifier".
>
> Because what if the nullability is hidden in some called function?
> To give another example - less stupid this time - think of
> something like this:
>
>     int my_fn(int * _Nullable p)
>     {
>         if (check_validity(p))
>             return -EINVAL;
>         return access(p);
>     }
>
> where we have perhaps done extensive validity checks on 'p' (think the
> kernel kind of 'access_ok()' function) in the 'check_validity()'
> function, but the compiler doesn't see that function, since it's a
> rather complicated one that does a whole RB-tree lookup etc. So the
> compiler hasn't *seen* that we do a NULL check there.
>
> So it shouldn't warn, but it will - because the compiler is oblivious
> about the fact that the pointer has actually been checked for a lot
> more than just NULL.
>
> If you think of this as a "value analysis" issue, rather than as a
> type issue, the solution is obvious: it's not that the type of 'p'
> changes, but you just want a way to tell the compiler "I've done range
> checking, the new range is XYZ".
>
> And if you think of it that way, you don't want to re-declare a type,
> you want to just update range information, and simply state something
> like
>
>     _Nonnull p;
>
> after doing the check_validity() call. IOW, I really think you should
> be able to write something like
>
>     int my_fn(int * _Nullable p)
>     {
>         if (check_validity(p))
>             return -EINVAL;
>         _Nonnull p;
>         return access(p);
>     }
>
> See? My argument is basically that I like the _Nullable/_Nonnull
> attributes, but that they shouldn't be seen as part of the *type*
> system, but as a more dynamic value range thing, and that they can -
> and should - be available separately from just the declaration.
>
> Linus
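The "_Nonnull p;" statement in the quoted text is a proposal and does not exist in any compiler. The closest thing available today in GCC and Clang is probably __builtin_unreachable(), which can pass a value fact to the optimizer after an opaque check. The sketch below is written under that assumption; ASSUME_NONNULL() and access_value() are hypothetical names, and check_validity() is a trivial stand-in for the expensive check described in the message. It conveys information to the optimizer only and is not known to silence any nullability diagnostic.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical macro: promises the compiler that p is not NULL from
 * this point on. If the promise is false, behaviour is undefined.
 */
#define ASSUME_NONNULL(p) \
	do { if ((p) == NULL) __builtin_unreachable(); } while (0)

/*
 * Stand-in for an opaque validity check (in real code it would live
 * in another translation unit). Returns nonzero on failure.
 */
int check_validity(const int *p)
{
	return p == NULL;
}

/* Dereferences its argument; callers must pass a non-NULL pointer. */
int access_value(const int *p)
{
	return *p;
}

int my_fn(const int *p)
{
	if (check_validity(p))
		return -EINVAL;
	ASSUME_NONNULL(p);	/* plays the role of the proposed "_Nonnull p;" */
	return access_value(p);
}
```

The design point is the same as in the message: the fact "p is non-NULL here" is attached to a program point after the check, not to the declared type of p.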
* Re: A few proposals from the C standards committee
2024-01-23 16:46 A few proposals from the C standards committee Paul E. McKenney
2024-01-23 18:58 ` Linus Torvalds
@ 2024-01-23 20:16 ` H. Peter Anvin
2024-01-23 20:24 ` Linus Torvalds
2024-01-25 12:52 ` Paul E. McKenney
2024-01-23 22:39 ` Kees Cook
2 siblings, 2 replies; 19+ messages in thread
From: H. Peter Anvin @ 2024-01-23 20:16 UTC (permalink / raw)
To: paulmck, linux-toolchains; +Cc: peterz, rostedt, gregkh, keescook, torvalds

On 1/23/24 08:46, Paul E. McKenney wrote:
> Hello!
>
> On the perhaps unlikely off-chance that any of this is of interest.
>
> Thanx, Paul

On the contrary, I find it quite interesting. I have been in contact
with both the C and C++ committees.

> ------------------------------------------------------------------------
>
> List of proposals with clickable links:
>
> https://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log
>
> N3089 _Optional: a type qualifier to indicate pointer nullability
>         Proposes _Optional to tag pointer parameters such that
>         dereferencing the pointer without first checking for NULL gets
>         a compiler warning.
>         https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3089.pdf
>
> N3190 Extensions to the preprocessor for C2Y
>         Proposes a number of macros, including things that return a
>         count of their arguments.
>         https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3090.htm

Some of these are *extremely* useful; in fact I believe I asked for
some of these when I previously contacted one of the C committee
members. One big motivator is making a size-safe printf().

That being said, they are missing some important bits, in particular
#embed needs to be able to be expressed as _Embed() for the same
reason that #pragma has _Pragma(); in fact #embed needs it even more,
as it is something you really want to macroize.

#do and #foreach are mentioned but not defined. I'm wondering how
useful these are if they can't be macroized themselves.
At that point it might be better to have a proper macro function
language.

> N3194 Case range expressions
>         No fewer than 421 files in the Linux kernel use the "..." syntax,
>         as in "case 1 ... 3", but there are other syntaxes... So they
>         are proposing "::" instead. My guess is that "..." won't be
>         going away anytime soon.

"::" would be a disaster for C++ compatibility, and I'm feeling that C
might end up needing to support C++ namespaces or some other mechanism
like that.

".." would be better if "..." is unacceptable, or [foo,bar].

That would be inconsistent with the range syntax for initializers, if
that is standard (I don't remember).

> N3195 Named loops
>         Placing a goto label before a loop allows a break/continue to
>         target that loop in case of nesting.

... which so many languages already support as an extension.

> n3203 Strict order of expression evaluation
>         I do like it. The 1980s were over a long time ago.

The question is: is this going to wreak havoc with performance? The C++
reference implies it won't, though.

> N3199 Improved __attribute__((cleanup)) Through defer
> N3198 Conditionally Supported Unwinding
>         The Linux kernel is starting to use __attribute__((cleanup))
>         via guard(), with 40 files making use of this. It is not clear
>         to me whether or not either of these proposals would be useful
>         to the Linux kernel.
>
> N3201 Operator Overloading Without Name Mangling v2
>         I have seen Linux-kernel interest in *function* overloading, but
>         not in operator overloading. Nevertheless...
>
>         The trick here is to associate a given operator with a function,
>         so that the name-mangling becomes essentially a manual operation.

It's kind of odd. It feels a bit like doing C++ backwards...

Thanks for the heads-up. I think I'm going to reach out and chat with
these folks.

-hpa
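For context on the "count of their arguments" item in the message above: without new preprocessor features, argument counting is usually done today with a positional-selection trick like the sketch below. The macro and function names (COUNT_ARGS, COUNT_ARGS_PICK, sum_n, SUM) are chosen for this example, and the sketch handles one to eight arguments only; it is not the mechanism proposed in N3190.

```c
#include <stdarg.h>

/*
 * Pick the tenth argument. The caller pads the list with a descending
 * sequence, so the value that lands in slot N is the argument count.
 */
#define COUNT_ARGS_PICK(_1, _2, _3, _4, _5, _6, _7, _8, N, ...) N
#define COUNT_ARGS(...) \
	COUNT_ARGS_PICK(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

/*
 * A variadic function cannot discover how many arguments it received,
 * so the count has to be passed explicitly.
 */
int sum_n(int n, ...)
{
	va_list ap;
	int total = 0;

	va_start(ap, n);
	for (int i = 0; i < n; i++)
		total += va_arg(ap, int);
	va_end(ap);
	return total;
}

/* The macro supplies the count so callers cannot get it wrong. */
#define SUM(...) sum_n(COUNT_ARGS(__VA_ARGS__), __VA_ARGS__)
```

The fixed upper bound and the unhandled zero-argument case are the kind of limitations that a built-in counting facility would presumably remove.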
* Re: A few proposals from the C standards committee
2024-01-23 20:16 ` H. Peter Anvin
@ 2024-01-23 20:24 ` Linus Torvalds
2024-01-24 14:58 ` Paul E. McKenney
2024-01-25 12:52 ` Paul E. McKenney
1 sibling, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2024-01-23 20:24 UTC (permalink / raw)
To: H. Peter Anvin
Cc: paulmck, linux-toolchains, peterz, rostedt, gregkh, keescook

On Tue, 23 Jan 2024 at 12:19, H. Peter Anvin <hpa@zytor.com> wrote:
>
> > n3203 Strict order of expression evaluation
> >         I do like it. The 1980s were over a long time ago.
>
> The question is: is this going to wreak havoc with performance? The C++
> reference implies it won't, though.

Well, they also had numbers from an actual implementation showing that
it didn't (ie "win some, lose some").

The "ordering is undefined" is, I think, almost entirely an effect of
"compilers weren't that smart, and implementations differed".

So I'd love for sequence points to go away. They are one of the more
subtle parts of C, and I do not believe that they have any real
advantage any more.

(And by "go away" I obviously mean "everything is a sequence point",
not "nothing is a sequence point" - so they'd go away as a concept,
because they'd become a non-issue).

Linus
* Re: A few proposals from the C standards committee
2024-01-23 20:24 ` Linus Torvalds
@ 2024-01-24 14:58 ` Paul E. McKenney
0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2024-01-24 14:58 UTC (permalink / raw)
To: Linus Torvalds
Cc: H. Peter Anvin, linux-toolchains, peterz, rostedt, gregkh, keescook

On Tue, Jan 23, 2024 at 12:24:44PM -0800, Linus Torvalds wrote:
> On Tue, 23 Jan 2024 at 12:19, H. Peter Anvin <hpa@zytor.com> wrote:
> >
> > > n3203 Strict order of expression evaluation
> > >         I do like it. The 1980s were over a long time ago.
> >
> > The question is: is this going to wreak havoc with performance? The C++
> > reference implies it won't, though.
>
> Well, they also had numbers from an actual implementation showing that
> it didn't (ie "win some, lose some").
>
> The "ordering is undefined" is, I think, almost entirely an effect of
> "compilers weren't that smart, and implementations differed".
>
> So I'd love for sequence points to go away. They are one of the more
> subtle parts of C, and I do not believe that they have any real
> advantage any more.
>
> (And by "go away" I obviously mean "everything is a sequence point",
> not "nothing is a sequence point" - so they'd go away as a concept,
> because they'd become a non-issue).

Agreed, and here is hoping!

Thanx, Paul
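The evaluation-order issue discussed in this subthread can be illustrated with a small sketch. In current C the two operands of "+" may be evaluated in either order, so a portable program that cares about the order has to split the expression into separate statements. The names note(), sum_unordered(), sum_ordered() and last_trace() are invented for this example.

```c
/* Records the order in which note() was called. */
static char trace[8];
static int pos;

/* Appends a tag to the trace and returns the value unchanged. */
int note(char tag, int value)
{
	trace[pos++] = tag;
	trace[pos] = '\0';
	return value;
}

/*
 * In current C the order of the two calls is unspecified:
 * the trace may end up as "ab" or as "ba".
 */
int sum_unordered(void)
{
	pos = 0;
	return note('a', 1) + note('b', 2);
}

/*
 * Separate full expressions are sequenced, so the calls happen
 * left to right and the trace is always "ab".
 */
int sum_ordered(void)
{
	int lhs, rhs;

	pos = 0;
	lhs = note('a', 1);
	rhs = note('b', 2);
	return lhs + rhs;
}

const char *last_trace(void)
{
	return trace;
}
```

Both functions return 3 either way; only the side-effect order differs. Under a strict-ordering rule such as the one proposed, the first form would presumably behave like the second.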
* Re: A few proposals from the C standards committee
2024-01-23 20:16 ` H. Peter Anvin
2024-01-23 20:24 ` Linus Torvalds
@ 2024-01-25 12:52 ` Paul E. McKenney
1 sibling, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2024-01-25 12:52 UTC (permalink / raw)
To: H. Peter Anvin
Cc: linux-toolchains, peterz, rostedt, gregkh, keescook, torvalds

On Tue, Jan 23, 2024 at 12:16:37PM -0800, H. Peter Anvin wrote:
> On 1/23/24 08:46, Paul E. McKenney wrote:

[ . . . ]

> > N3201 Operator Overloading Without Name Mangling v2
> >         I have seen Linux-kernel interest in *function* overloading, but
> >         not in operator overloading. Nevertheless...
> >
> >         The trick here is to associate a given operator with a function,
> >         so that the name-mangling becomes essentially a manual operation.
>
> It's kind of odd. It feels a bit like doing C++ backwards...
>
> Thanks for the heads-up. I think I'm going to reach out and chat with these
> folks.

In the discussion, the clang "__attribute((overloadable))" came up.
Does that do what you want? If so, perhaps GCC can be persuaded to
add it.

Thanx, Paul
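As a point of comparison for "name-mangling becomes essentially a manual operation", standard C11 already allows a form of function overloading through _Generic, where the programmer writes the per-type function names by hand. The sketch below uses that standard mechanism, not the clang attribute mentioned above and not the N3201 syntax; max_int(), max_double() and max_of() are names invented for the example.

```c
/* Hand-written, manually "mangled" per-type implementations. */
int max_int(int a, int b)
{
	return a > b ? a : b;
}

double max_double(double a, double b)
{
	return a > b ? a : b;
}

/*
 * Dispatch on the type of (a) + (b), so that mixed int/double calls
 * select the double version after the usual arithmetic conversions.
 * Types without an entry in the list are rejected at compile time.
 */
#define max_of(a, b) _Generic((a) + (b), \
	int: max_int, \
	double: max_double)((a), (b))
```

A call such as max_of(2, 5) selects max_int(), while max_of(2, 2.5) selects max_double(). The cost is that every supported type has to be listed in the macro, which is the manual bookkeeping that compiler-side overloading would take over.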
* Re: A few proposals from the C standards committee
2024-01-23 16:46 A few proposals from the C standards committee Paul E. McKenney
2024-01-23 18:58 ` Linus Torvalds
2024-01-23 20:16 ` H. Peter Anvin
@ 2024-01-23 22:39 ` Kees Cook
2 siblings, 0 replies; 19+ messages in thread
From: Kees Cook @ 2024-01-23 22:39 UTC (permalink / raw)
To: Paul E. McKenney
Cc: linux-toolchains, peterz, hpa, rostedt, gregkh, ndesaulniers, justinstitt, torvalds, linux-hardening

On Tue, Jan 23, 2024 at 08:46:13AM -0800, Paul E. McKenney wrote:
> N3201 Operator Overloading Without Name Mangling v2

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3201.pdf

>         I have seen Linux-kernel interest in *function* overloading, but
>         not in operator overloading. Nevertheless...
>
>         The trick here is to associate a given operator with a function,
>         so that the name-mangling becomes essentially a manual operation.

The proposal discusses strings, but I would want to immediately use this
for handling wrap vs trap arithmetic (rather than using sanitizers[1]).

--
Kees Cook
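The "wrap vs trap" distinction can be written out by hand today using the overflow builtins that GCC and Clang provide. This is a minimal sketch of what overloaded operators on dedicated wrapping or trapping types might call underneath; the function names add_wrap() and add_trap() are hypothetical and nothing here is taken from N3201 itself.

```c
#include <limits.h>
#include <stdlib.h>

/*
 * Wrapping add: the builtin computes the mathematically exact sum and
 * stores it truncated to int, so overflow wraps around without
 * invoking undefined behaviour.
 */
int add_wrap(int a, int b)
{
	int result;

	(void)__builtin_add_overflow(a, b, &result);
	return result;
}

/*
 * Trapping add: the builtin returns true when the exact sum does not
 * fit in int, in which case the program is aborted.
 */
int add_trap(int a, int b)
{
	int result;

	if (__builtin_add_overflow(a, b, &result))
		abort();
	return result;
}
```

With operator overloading, the choice between the two behaviours could be attached to the operand type so that plain "a + b" picks the right one; without it, every call site has to name the helper explicitly.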