linux-arch.vger.kernel.org archive mirror
* Metalanguage for the Linux UAPI
@ 2025-05-15 20:04 H. Peter Anvin
  2025-05-15 20:26 ` enh
  2025-05-15 21:20 ` Linus Torvalds
  0 siblings, 2 replies; 11+ messages in thread
From: H. Peter Anvin @ 2025-05-15 20:04 UTC
  To: Arnd Bergmann, LKML, Linus Torvalds, libc-alpha, linux-arch

OK, so this is something I have been thinking about for quite a while. 
It would be a quite large project, so I would like to hear people's 
opinions on it before even starting.

We have finally succeeded in divorcing the Linux UAPI from the general 
kernel headers, but even so, there are a lot of things in the UAPI that 
mean it is not possible for an arbitrary libc to use it directly; for 
example "struct termios" is not the glibc "struct termios", but 
redefining it breaks the ioctl numbering unless the ioctl headers are 
changed as well, and so on. However, other libcs want to use the struct 
termios as defined in the kernel, or, more likely, struct termios2.

Furthermore, I was looking further into how C++ templates could be used 
to make user pointers inherently safe and probably more efficient, but 
ran into the problem that you really want to be able to convert a 
user-tagged structure to a structure with "safe-user-tagged" members 
(after access_ok), which turned out not to be trivially supportable even 
after the latest C++ modernizations (without which I don't consider C++ 
viable at all; I would not consider versions of C++ before C++17 worthy 
of even looking at; C++20 preferred.)

And it is not just generation of in-kernel versus out-of-kernel headers 
that is an issue (which we have managed to deal with pretty well.) There 
generally isn't enough information in C headers alone to do well at 
creating bindings for other languages, *especially* given how many 
constants are defined in terms of macros.

The use of C also makes it hard to mangle the headers for user space. 
For example, glibc has to add __extension__ before anonymous struct or 
union members in order to be able to compile in strict C90 mode.

I have been considering if it would make sense to create more of a 
metalanguage for the Linux UAPI. This would be run through a preprocessor 
more advanced than cpp, written in C and yacc/bison. (It could 
also be done via a gcc plugin or a DWARF parser, but I do not like tying 
this to compiler internals, and DWARF parsing is probably more complex 
and less versatile.)

It could thus provide things like "true" constants (constexpr for C++11 
or C23, or enums), bitfield macro explosions and so on, depending on 
what the backend user would like: namespacing, distributed enumerations, 
and assembly offset constants, and even possibly syscall stubs.
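
As a rough illustration of what a "bitfield macro explosion" could look
like on the C side, here is a sketch. It is not part of the proposal: the
field, its position, and every EX_-prefixed name are assumptions made up
for this example. A single field description might expand into a shift, a
width, a mask, and a pair of accessors:

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical field "mode": 3 bits wide, starting at bit 4. */
#define EX_MODE_SHIFT 4
#define EX_MODE_WIDTH 3
#define EX_MODE_MASK  (((1U << EX_MODE_WIDTH) - 1U) << EX_MODE_SHIFT)

static_assert(EX_MODE_SHIFT + EX_MODE_WIDTH <= 32, "field must fit in 32 bits");

/* Extract the field from a register/flags word. */
static inline unsigned ex_mode_get(unsigned word)
{
    return (word & EX_MODE_MASK) >> EX_MODE_SHIFT;
}

/* Return a copy of word with the field replaced by mode (excess bits dropped). */
static inline unsigned ex_mode_set(unsigned word, unsigned mode)
{
    return (word & ~EX_MODE_MASK) | ((mode << EX_MODE_SHIFT) & EX_MODE_MASK);
}
```

A different backend could emit the same description as an enum, a
constexpr, or an assembly constant instead.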

There is of course no reason such a generator couldn't be used for 
kernel-only headers at some point, but I am concentrating on the

Another major motivation is to be able to include one named struct 
anonymously inside another, without having to repeat the definition. 
(This is not supported in standard C or GNU C; MS C supports it as an 
extension, and I have requested that it be added into GNU C which would 
also allow it to be used with __extension__, and perhaps get folded into 
a future C standard since it would now fit the criterion of more than 
one implementation; however, the runway for being able to use that in 
UAPI headers is quite long.)
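
For illustration only: one way to approximate this in standard C11 today
is to write the member list once in a macro and expand it both into the
named struct and into an anonymous struct member. The macro and struct
names below are hypothetical, and this sketch is a workaround, not the
requested GNU C extension:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* Member list written exactly once (hypothetical example struct). */
#define EX_POINT_MEMBERS int x; int y;

/* The named struct, usable on its own. */
struct ex_point { EX_POINT_MEMBERS };

/* The same members embedded anonymously: accessed as r.x, r.y. */
struct ex_rect {
    struct { EX_POINT_MEMBERS };   /* C11 anonymous struct member */
    int w;
    int h;
};

/* Both views must agree on layout for the trick to be safe. */
static_assert(offsetof(struct ex_rect, x) == offsetof(struct ex_point, x),
              "x must sit at the same offset");
static_assert(offsetof(struct ex_rect, y) == offsetof(struct ex_point, y),
              "y must sit at the same offset");

static inline int ex_rect_right(const struct ex_rect *r)
{
    return r->x + r->w;   /* embedded member used without a field name */
}
```

The cost is that the definition lives in a macro rather than in ordinary
struct syntax, which is the kind of mangling a generator could hide.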

I obviously want to keep a C-like syntax for this, which is a major 
reason for using a parser like yacc/bison.

I have done such a project in the past, with some good success. That 
being said, the requirements for the Linux UAPI language are obviously 
much more complex. A few things I have considered are wanting to be able 
to namespace constants or, more or less equivalently, create 
enumerations in bits and pieces (consider ioctl constants, for example) 
and have them coalesce into a single definition if appropriate for the 
target language.

Speaking of ioctl constants: one of the current problems is that a fair 
number of ioctl constants do not have the size/type annotations, and 
perhaps worse, it is impossible to tell which ones from just the numeric 
value (since _IOC_NONE expands to 0, an _IO() ioctl ends up having no type 
information at all.) This is something that *definitely* ought to be 
added, even if a certain backend cannot preserve that information.
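
To make the _IO() point concrete, here is a small self-contained sketch.
It re-creates the bit layout of the generic ioctl encoding under
hypothetical EX_ names rather than including the kernel header, so treat
the exact field widths as an assumption of this example (some
architectures use a different layout):

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Assumed layout: nr in bits 0-7, type in 8-15, size in 16-29, dir in 30-31. */
#define EX_IOC_NRSHIFT    0
#define EX_IOC_TYPESHIFT  8
#define EX_IOC_SIZESHIFT  16
#define EX_IOC_SIZEBITS   14
#define EX_IOC_DIRSHIFT   30

#define EX_IOC_NONE  0U
#define EX_IOC_WRITE 1U
#define EX_IOC_READ  2U

#define EX_IOC(dir, type, nr, size)                   \
    (((uint32_t)(dir)  << EX_IOC_DIRSHIFT)  |         \
     ((uint32_t)(type) << EX_IOC_TYPESHIFT) |         \
     ((uint32_t)(nr)   << EX_IOC_NRSHIFT)   |         \
     ((uint32_t)(size) << EX_IOC_SIZESHIFT))

/* No-argument form: direction and size are both encoded as zero. */
#define EX_IO(type, nr)      EX_IOC(EX_IOC_NONE, (type), (nr), 0)
/* Read form: the size of the argument type is baked into the number. */
#define EX_IOR(type, nr, T)  EX_IOC(EX_IOC_READ, (type), (nr), sizeof(T))

struct ex_arg { uint32_t a; uint32_t b; };   /* hypothetical payload */

static_assert(EX_IO('T', 1) == 0x5401U, "only type and nr survive in EX_IO()");

static inline unsigned ex_ioc_dir(uint32_t cmd)
{
    return cmd >> EX_IOC_DIRSHIFT;
}

static inline unsigned ex_ioc_size(uint32_t cmd)
{
    return (cmd >> EX_IOC_SIZESHIFT) & ((1U << EX_IOC_SIZEBITS) - 1U);
}

/* Nonzero if the number itself carries any direction or size information. */
static inline int ex_ioc_has_type_info(uint32_t cmd)
{
    return ex_ioc_dir(cmd) != 0 || ex_ioc_size(cmd) != 0;
}
```

A number built with EX_IO() is indistinguishable from one that merely
lacks annotations, which is why the information would have to live in the
description rather than be recovered from the value.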

Thoughts?

	-hpa



* Re: Metalanguage for the Linux UAPI
  2025-05-15 20:04 Metalanguage for the Linux UAPI H. Peter Anvin
@ 2025-05-15 20:26 ` enh
  2025-05-15 21:24   ` H. Peter Anvin
  2025-05-15 21:20 ` Linus Torvalds
  1 sibling, 1 reply; 11+ messages in thread
From: enh @ 2025-05-15 20:26 UTC
  To: H. Peter Anvin
  Cc: Arnd Bergmann, LKML, Linus Torvalds, libc-alpha, linux-arch

On Thu, May 15, 2025 at 4:05 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> OK, so this is something I have been thinking about for quite a while.
> It would be a quite large project, so I would like to hear people's
> opinions on it before even starting.
>
> We have finally succeeded in divorcing the Linux UAPI from the general
> kernel headers, but even so, there are a lot of things in the UAPI that
> means it is not possible for an arbitrary libc to use it directly; for
> example "struct termios" is not the glibc "struct termios", but
> redefining it breaks the ioctl numbering unless the ioctl headers are
> changed as well, and so on. However, other libcs want to use the struct
> termios as defined in the kernel, or, more likely, struct termios2.

bionic is a ("the only"?) libc that tries to not duplicate _anything_
and always defer to the uapi headers. we have quite an extensive list
of hacks we need to apply to rewrite the uapi headers into something
directly usable (and a lot of awful python to apply those hacks):

https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/kernel/tools/defaults.py

a lot are just name collisions ("you say 'class', my c++ compiler says
wtf?!"), but there are a few "posix and linux disagree"s too. (other
libcs that weren't linux-only from day one might have more conflicts,
such as a comically large sigset_t, say :-) )

but i think most if not all of that could be fixed upstream, given the will?

(though some c programmers do still get upset if told they shouldn't
use c++ keywords as identifiers, i note that the uapi headers _were_
recently fixed to avoid a c extension that's invalid c++. thanks,
anyone involved in that who's reading this!)

> Furthermore, I was looking further into how C++ templates could be used
> to make user pointers inherently safe and probably more efficient, but
> ran into the problem that you really want to be able to convert a
> user-tagged structure to a structure with "safe-user-tagged" members
> (after access_ok), which turned out not to be trivially supportable even
> after the latest C++ modernizations (without which I don't consider C++
> viable at all; I would not consider versions of C++ before C++17 worthy
> of even looking at; C++20 preferred.)

(/me assumes you're just trolling linus with this.)

> And it is not just generation of in-kernel versus out-of-kernel headers
> that is an issue (which we have managed to deal with pretty well.) There
> generally isn't enough information in C headers alone to do well at
> creating bindings for other languages, *especially* given how many
> constants are defined in terms of macros.

(yeah, while i think the _c_ [and c++] problems could be solved much
more easily, solving the swift/rust/golang duplication of all that
stuff is a whole other thing. i'd try to sign up one of those
languages' library's maintainers before investing too much in having
another representation of the uapi though...)

> The use of C also makes it hard to mangle the headers for user space.
> For example, glibc has to add __extension__ before anonymous struct or
> union members in order to be able to compile in strict C90 mode.

(again, that one seems easily fixable upstream.)

> I have been considering if it would make sense to create more of a
> metalanguage for the Linux UAPI. This would be run through a more
> advanced preprocessor than cpp written in C and yacc/bison. (It could
> also be done via a gcc plugin or a DWARF parser, but I do not like tying
> this to compiler internals, and DWARF parsing is probably more complex
> and less versatile.)
>
> It could thus provide things like "true" constants (constexpr for C++11
> or C23, or enums), bitfield macro explosions and so on, depending on
> what the backend user would like: namespacing, distributed enumerations,
> and assembly offset constants, and even possibly syscall stubs.

(given a clean slate that wouldn't be terrible, but you get a lot of
#if nonsense. though the `#define foo foo` trick lets you have the
best of both worlds [at some cost to compile time].)
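
(for readers who haven't met it, a minimal sketch of the `#define foo foo`
trick follows. the EX_ names are invented for the example: the constant is
a real enumerator, so debuggers and other tools can see it, while the
self-referential macro keeps `#ifdef` feature tests working.)

```c
#include <assert.h>   /* static_assert (C11) */

/* Real enumerators: typed, visible to debuggers and binding generators. */
enum ex_flags {
    EX_FLAG_A = 1,
    EX_FLAG_B = 2
};

/* Self-referential macros: expand to the enumerator of the same name. */
#define EX_FLAG_A EX_FLAG_A
#define EX_FLAG_B EX_FLAG_B

/* User code can still probe for a constant with the preprocessor. */
#ifdef EX_FLAG_B
#define EX_HAVE_FLAG_B 1
#else
#define EX_HAVE_FLAG_B 0
#endif

/* EX_FLAG_C is deliberately not defined anywhere. */
#ifdef EX_FLAG_C
#define EX_HAVE_FLAG_C 1
#else
#define EX_HAVE_FLAG_C 0
#endif

static_assert(EX_FLAG_B == 2, "macro expands back to the enumerator");

static inline int ex_supported_flags(void)
{
    return EX_FLAG_A | EX_FLAG_B;
}
```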

> There is of course no reason such a generator couldn't be used for
> kernel-only headers at some point, but I am concentrating on the
>
> Another major motivation is to be able to include one named struct
> anonymously inside another, without having to repeat the definition.
> (This is not supported in standard C or GNU C; MS C supports it as an
> extension, and I have requested that it be added into GNU C which would
> also allow it to be used with __extension__, and perhaps get folded into
> a future C standard since it would now fit the criterion of more than
> one implementation; however, the runway for being able to use that in
> UAPI headers is quite long.)
>
> I obviously want to keep a C-like syntax for this, which is a major
> reason for using a parser like yacc/bison.
>
> I have done such a project in the past, with some good success. That
> being said, the requirements for the Linux UAPI language are obviously
> much more complex. A few things I have considered are wanting to be able
> to namespace constants or, more or less equivalently, create
> enumerations in bits and pieces (consider ioctl constants, for example)
> and have them coalesce into a single definition if appropriate for the
> target language.
>
> Speaking of ioctl constants: one of the current problems is that a fair
> number of ioctl constants do not have the size/type annotations, and
> perhaps worse, it is impossible to tell from just the numeric value
> (since _IOC_NONE expands to 0, an _IO() ioctl ends up having no type
> information at all.) This is something that *definitely* ought to be
> added, even if a certain backend cannot preserve that information
>
> Thoughts?
>
>         -hpa
>


* Re: Metalanguage for the Linux UAPI
  2025-05-15 20:04 Metalanguage for the Linux UAPI H. Peter Anvin
  2025-05-15 20:26 ` enh
@ 2025-05-15 21:20 ` Linus Torvalds
  2025-05-15 21:42   ` H. Peter Anvin
  1 sibling, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2025-05-15 21:20 UTC
  To: H. Peter Anvin; +Cc: Arnd Bergmann, LKML, libc-alpha, linux-arch

On Thu, 15 May 2025 at 13:05, H. Peter Anvin <hpa@zytor.com> wrote:
>
> We have finally succeeded in divorcing the Linux UAPI from the general
> kernel headers, but even so, there are a lot of things in the UAPI that
> means it is not possible for an arbitrary libc to use it directly; for
> example "struct termios" is not the glibc "struct termios", but
> redefining it breaks the ioctl numbering unless the ioctl headers are
> changed as well, and so on. However, other libcs want to use the struct
> termios as defined in the kernel, or, more likely, struct termios2.

Honestly, I *really* don't want to go down that rat-hole.

It's going to be full of random project-specific issues, and the
bigger projects - like glibc - wouldn't use the kernel headers anyway,
even with some generic language, because they have their own history,
they deal with lots of other non-Linux platforms, and it's just all
downside for them.

In fact, it's all downside for the kernel too. I do *not* want kernel
headers to be used by other projects, because I simply don't want to
hear about "we do Xyz, so the innocuous uapi header change breaks
Abc". It's all pain, for no gain.

So as far as I'm concerned, the uapi header split has been very
successful - but not because other projects can then use our uapi
headers. No, purely because it helped *kernel* people be more careful
about a certain class of changes, and was a big red flag in that it
made people go "Oh, I can't just change that structure, because it's
exported as an API to user space".

If you _really_ want to do a Metalanguage for these things, and want
to support lots of different namespace issues, several different
languages etc, I have a very practical suggestion: make that
metalanguage have a very strict and traditional syntax. Make it look
like C with the C pre-processor.

There are lots of libraries and tools to parse C, and turn it into
other forms. Making up a new language when we already *have* a good
language is all kinds of silly. Just use the language it already is
in, and take advantage of the fact that there's lots of infrastructure
for that language.

                    Linus


* Re: Metalanguage for the Linux UAPI
  2025-05-15 20:26 ` enh
@ 2025-05-15 21:24   ` H. Peter Anvin
  2025-05-16  3:42     ` Willy Tarreau
  2025-05-16 17:27     ` enh
  0 siblings, 2 replies; 11+ messages in thread
From: H. Peter Anvin @ 2025-05-15 21:24 UTC
  To: enh; +Cc: Arnd Bergmann, LKML, Linus Torvalds, libc-alpha, linux-arch

On 5/15/25 13:26, enh wrote:
> On Thu, May 15, 2025 at 4:05 PM H. Peter Anvin <hpa@zytor.com> wrote:
>>
>> OK, so this is something I have been thinking about for quite a while.
>> It would be a quite large project, so I would like to hear people's
>> opinions on it before even starting.
>>
>> We have finally succeeded in divorcing the Linux UAPI from the general
>> kernel headers, but even so, there are a lot of things in the UAPI that
>> means it is not possible for an arbitrary libc to use it directly; for
>> example "struct termios" is not the glibc "struct termios", but
>> redefining it breaks the ioctl numbering unless the ioctl headers are
>> changed as well, and so on. However, other libcs want to use the struct
>> termios as defined in the kernel, or, more likely, struct termios2.
> 
> bionic is a ("the only"?) libc that tries to not duplicate _anything_
> and always defer to the uapi headers. we have quite an extensive list
> of hacks we need to apply to rewrite the uapi headers into something
> directly usable (and a lot of awful python to apply those hacks):
> 
> https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/kernel/tools/defaults.py
> 

Not "the only".

> a lot are just name collisions ("you say 'class', my c++ compiler says
> wtf?!"), but there are a few "posix and linux disagree"s too. (other
> libcs that weren't linux-only from day one might have more conflicts,
> such as a comically large sigset_t, say :-) )
> 
> but i think most if not all of that could be fixed upstream, given the will?
> 
> (though some c programmers do still get upset if told they shouldn't
> use c++ keywords as identifiers, i note that the uapi headers _were_
> recently fixed to avoid a c extension that's invalid c++. thanks,
> anyone involved in that who's reading this!)
> 
>> Furthermore, I was looking further into how C++ templates could be used
>> to make user pointers inherently safe and probably more efficient, but
>> ran into the problem that you really want to be able to convert a
>> user-tagged structure to a structure with "safe-user-tagged" members
>> (after access_ok), which turned out not to be trivially supportable even
>> after the latest C++ modernizations (without which I don't consider C++
>> viable at all; I would not consider versions of C++ before C++17 worthy
>> of even looking at; C++20 preferred.)
> 
> (/me assumes you're just trolling linus with this.)

I'm not; I posted a long article about why I think it might be an 
alternative worth pursuing. I know, of course, Linus' long time hatred 
of C++, but as I said: I think *very recent* versions of C++ have a lot 
to offer, mainly in the form of metaprogramming (which we currently do 
using some amazingly ugly macros.)

https://lore.kernel.org/lkml/3465e0c6-f5b2-4c42-95eb-29361481f805@zytor.com

>> And it is not just generation of in-kernel versus out-of-kernel headers
>> that is an issue (which we have managed to deal with pretty well.) There
>> generally isn't enough information in C headers alone to do well at
>> creating bindings for other languages, *especially* given how many
>> constants are defined in terms of macros.
> 
> (yeah, while i think the _c_ [and c++] problems could be solved much
> more easily, solving the swift/rust/golang duplication of all that
> stuff is a whole other thing. i'd try to sign up one of those
> languages' library's maintainers before investing too much in having
> another representation of the uapi though...)

Yes, that's one of the reasons for posting this.

>> The use of C also makes it hard to mangle the headers for user space.
>> For example, glibc has to add __extension__ before anonymous struct or
>> union members in order to be able to compile in strict C90 mode.
> 
> (again, that one seems easily fixable upstream.)

Agreed... until it breaks again. And how much

>> I have been considering if it would make sense to create more of a
>> metalanguage for the Linux UAPI. This would be run through a more
>> advanced preprocessor than cpp written in C and yacc/bison. (It could
>> also be done via a gcc plugin or a DWARF parser, but I do not like tying
>> this to compiler internals, and DWARF parsing is probably more complex
>> and less versatile.)
>>
>> It could thus provide things like "true" constants (constexpr for C++11
>> or C23, or enums), bitfield macro explosions and so on, depending on
>> what the backend user would like: namespacing, distributed enumerations,
>> and assembly offset constants, and even possibly syscall stubs.
> 
> (given a clean slate that wouldn't be terrible, but you get a lot of
> #if nonsense. though the `#define foo foo` trick lets you have the
> best of both worlds [at some cost to compile time].)

Again, that would be a choice for the data consumer (backend), which is 
one of the main advantages here.

	-hpa



* Re: Metalanguage for the Linux UAPI
  2025-05-15 21:20 ` Linus Torvalds
@ 2025-05-15 21:42   ` H. Peter Anvin
  2025-05-15 22:06     ` Linus Torvalds
  0 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2025-05-15 21:42 UTC
  To: Linus Torvalds; +Cc: Arnd Bergmann, LKML, libc-alpha, linux-arch

On 5/15/25 14:20, Linus Torvalds wrote:
> 
> If you _really_ want to do a Metalanguage for these things, and want
> to support lots of different namespace issues, several different
> languages etc, I have a very practical suggestion: make that
> metalanguage have a very strict and traditional syntax. Make it look
> like C with the C pre-processor.
> 
> There are lots of libraries and tools to parse C, and turn it into
> other forms. Making up a new language when we already *have* a good
> language is all kinds of silly. Just use the language it already is
> in, and take advantage of the fact that there's lots of infrastructure
> for that language.
> 

Yes, and I looked at using sparse for this purpose. It is not a bad
choice all things considered, but there is definitely metadata that we
simply don't provide.

Building it on top of sparse might still very well be The Right Thing.

It doesn't just affect libc, either; it also affects tools like strace,
sanitizer, and so on. The situation with ioctls is by far the worst.

	-hpa



* Re: Metalanguage for the Linux UAPI
  2025-05-15 21:42   ` H. Peter Anvin
@ 2025-05-15 22:06     ` Linus Torvalds
  0 siblings, 0 replies; 11+ messages in thread
From: Linus Torvalds @ 2025-05-15 22:06 UTC
  To: H. Peter Anvin; +Cc: Arnd Bergmann, LKML, libc-alpha, linux-arch

On Thu, 15 May 2025 at 14:42, H. Peter Anvin <hpa@zytor.com> wrote:
>
> Building it on top of sparse might still very well be The Right Thing.

You can look at the 'ctags.c' file in the sparse code, it basically
does a lot of this kind of thing: it parses the file, then walks
through all the symbols and #defines that it found.

It then obviously prints out filenames and line numbers rather than
converting the result into something else, so I'm not claiming it's
useful as-is, but from a "how to parse a file and walk the symbols it
declares" standpoint it does almost everything.

I just tested, and it looks like it hates the kernel headers because
it hasn't been updated to understand about the bitwise type and dies
with a

    builtin:0:0: error: unknown symbol __le16 namespace:0 type:13

but when I just made it not die it seems to actually do its thing, and
knows about how sparse considers preprocessor symbols to be symbols
just like C symbols are, just in a different namespace.

Obviously for things like ioctl numbers, you'd need to then make it
also actually *evaluate* the #define etc after you list them. You
could do that in sparse itself, or you could do it by just creating a
list of #define's from within sparse, and then have a separate pass
that evaluates their values. That 'ctags' thing doesn't do any of
that, of course.
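
A rough sketch of what such a separate evaluation pass could look like,
assuming the first pass has already produced a plain list of macro names:
a generated C file lets the compiler do the evaluation and records each
name next to its value. The EX_ macros and the table layout are invented
for this example and have nothing to do with sparse's actual output:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* NULL */
#include <string.h>   /* strcmp */

/* Stand-ins for #defines found by the first pass (hypothetical). */
#define EX_BASE 0x100
#define EX_FOO  (EX_BASE + 1)
#define EX_BAR  (EX_FOO << 4)

struct ex_const {
    const char *name;      /* spelling of the macro */
    unsigned long value;   /* value after full expansion */
};

/* #name is not expanded, (name) is: one entry per listed macro. */
#define EX_ENTRY(name) { #name, (unsigned long)(name) }

static const struct ex_const ex_consts[] = {
    EX_ENTRY(EX_BASE),
    EX_ENTRY(EX_FOO),
    EX_ENTRY(EX_BAR),
};

static_assert(EX_BAR == 0x1010, "compiler evaluates the nested macros");

/* Look up an evaluated constant by name; returns NULL if not listed. */
static inline const struct ex_const *ex_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(ex_consts) / sizeof(ex_consts[0]); i++)
        if (strcmp(ex_consts[i].name, name) == 0)
            return &ex_consts[i];
    return NULL;
}
```

The table could then be dumped in whatever form a binding generator
wants, without that tool having to understand the preprocessor.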

              Linus


* Re: Metalanguage for the Linux UAPI
  2025-05-15 21:24   ` H. Peter Anvin
@ 2025-05-16  3:42     ` Willy Tarreau
  2025-05-16  4:17       ` H. Peter Anvin
  2025-05-16 17:27     ` enh
  1 sibling, 1 reply; 11+ messages in thread
From: Willy Tarreau @ 2025-05-16  3:42 UTC
  To: H. Peter Anvin
  Cc: enh, Arnd Bergmann, LKML, Linus Torvalds, libc-alpha, linux-arch

On Thu, May 15, 2025 at 02:24:29PM -0700, H. Peter Anvin wrote:
> On 5/15/25 13:26, enh wrote:
> > On Thu, May 15, 2025 at 4:05 PM H. Peter Anvin <hpa@zytor.com> wrote:
> > > 
> > > OK, so this is something I have been thinking about for quite a while.
> > > It would be a quite large project, so I would like to hear people's
> > > opinions on it before even starting.
> > > 
> > > We have finally succeeded in divorcing the Linux UAPI from the general
> > > kernel headers, but even so, there are a lot of things in the UAPI that
> > > means it is not possible for an arbitrary libc to use it directly; for
> > > example "struct termios" is not the glibc "struct termios", but
> > > redefining it breaks the ioctl numbering unless the ioctl headers are
> > > changed as well, and so on. However, other libcs want to use the struct
> > > termios as defined in the kernel, or, more likely, struct termios2.
> > 
> > bionic is a ("the only"?) libc that tries to not duplicate _anything_
> > and always defer to the uapi headers. we have quite an extensive list
> > of hacks we need to apply to rewrite the uapi headers into something
> > directly usable (and a lot of awful python to apply those hacks):
> > 
> > https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/kernel/tools/defaults.py
> > 
> 
> Not "the only".

Indeed, nolibc (/tools/include/nolibc) directly includes uapi as well, and
since nolibc doesn't compile anything but only exposes include files, these
appear as-is in the application. So far the headers look clean enough for
our use cases and have not caused problems. But admittedly, applications
are small and limited (selftests and init code).

One thing we've been considering which we would find convenient there
would be to generate an indirection layer for all files that would include
the right one depending on the detected arch so as to ease compilation for
any arch with all the uapi files available, as it seems totally feasible
right now (i.e. each .h file would just have "#if defined(__arch_xxx__)
#include <arch_xxx/foo.h>" etc). We could imagine having a
"make install-all-headers" target to produce that thing for example. I'm
sharing this so that you can also have this in mind to consider whether or
not your chosen approach would break that possibility.
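
To sketch the idea in a form that compiles on its own: the selection
logic of such a wrapper could look like the block below. The directory
names and the EX_ macro are placeholders, and the actual #include of the
per-arch file is left as a comment since those paths are hypothetical;
only the compiler-predefined arch macros are real:

```c
#include <assert.h>   /* static_assert (C11) */

/*
 * Pick the per-arch directory from compiler-predefined macros.
 * A generated foo.h would follow each branch with something like
 *     #include <arch_xxx/foo.h>
 * using whatever directory layout the install target produces.
 */
#if defined(__x86_64__) || defined(__i386__)
#  define EX_UAPI_ARCH_DIR "arch_x86"
#elif defined(__aarch64__)
#  define EX_UAPI_ARCH_DIR "arch_arm64"
#elif defined(__arm__)
#  define EX_UAPI_ARCH_DIR "arch_arm"
#elif defined(__riscv)
#  define EX_UAPI_ARCH_DIR "arch_riscv"
#else
#  define EX_UAPI_ARCH_DIR "arch_generic"
#endif

static_assert(sizeof(EX_UAPI_ARCH_DIR) > sizeof("arch_"),
              "an arch directory must have been selected");

/* Lets a test program report which branch was taken. */
static inline const char *ex_uapi_arch_dir(void)
{
    return EX_UAPI_ARCH_DIR;
}
```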

Just my two cents,
Willy


* Re: Metalanguage for the Linux UAPI
  2025-05-16  3:42     ` Willy Tarreau
@ 2025-05-16  4:17       ` H. Peter Anvin
  2025-05-16  4:22         ` Willy Tarreau
  0 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2025-05-16  4:17 UTC
  To: Willy Tarreau
  Cc: enh, Arnd Bergmann, LKML, Linus Torvalds, libc-alpha, linux-arch

On May 15, 2025 8:42:32 PM PDT, Willy Tarreau <w@1wt.eu> wrote:
>On Thu, May 15, 2025 at 02:24:29PM -0700, H. Peter Anvin wrote:
>> On 5/15/25 13:26, enh wrote:
>> > On Thu, May 15, 2025 at 4:05 PM H. Peter Anvin <hpa@zytor.com> wrote:
>> > > 
>> > > OK, so this is something I have been thinking about for quite a while.
>> > > It would be a quite large project, so I would like to hear people's
>> > > opinions on it before even starting.
>> > > 
>> > > We have finally succeeded in divorcing the Linux UAPI from the general
>> > > kernel headers, but even so, there are a lot of things in the UAPI that
>> > > means it is not possible for an arbitrary libc to use it directly; for
>> > > example "struct termios" is not the glibc "struct termios", but
>> > > redefining it breaks the ioctl numbering unless the ioctl headers are
>> > > changed as well, and so on. However, other libcs want to use the struct
>> > > termios as defined in the kernel, or, more likely, struct termios2.
>> > 
>> > bionic is a ("the only"?) libc that tries to not duplicate _anything_
>> > and always defer to the uapi headers. we have quite an extensive list
>> > of hacks we need to apply to rewrite the uapi headers into something
>> > directly usable (and a lot of awful python to apply those hacks):
>> > 
>> > https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/kernel/tools/defaults.py
>> > 
>> 
>> Not "the only".
>
>Indeed, nolibc (/tools/include/nolibc) directly includes uapi as well, and
>since nolibc doesn't compile anything but only exposes include files, these
>appear as-is in the application. So far the headers look clean enough for
>our use cases and have not caused problems. But admittedly, applications
>are small and limited (selftests and init code).
>
>One thing we've been considering which we would find convenient there
>would be to generate an indirection layer for all files that would include
>the right one depending on the detected arch so as to ease compilation for
>any arch with all the uapi files available, as it seems totally feasible
>right now (i.e. each .h file would just have "#if defined(__arch_xxx__)
>#include <arch_xxx/foo.h>" etc). We could imagine having a
>"make install-all-headers" target to produce that thing for example. I'm
>sharing this so that you can also have this in mind to consider whether or
>not your chosen approach would break that possibility.
>
>Just my two cents,
>Willy

Ah yes, nolibc; basically klibc reinvented...

<ducks and runs>


* Re: Metalanguage for the Linux UAPI
  2025-05-16  4:17       ` H. Peter Anvin
@ 2025-05-16  4:22         ` Willy Tarreau
  2025-05-16  4:35           ` H. Peter Anvin
  0 siblings, 1 reply; 11+ messages in thread
From: Willy Tarreau @ 2025-05-16  4:22 UTC
  To: H. Peter Anvin
  Cc: enh, Arnd Bergmann, LKML, Linus Torvalds, libc-alpha, linux-arch

On Thu, May 15, 2025 at 09:17:14PM -0700, H. Peter Anvin wrote:
> Ah yes, nolibc; basically klibc reinvented...
> <ducks and runs>

:-)

That was not the initial intent though as it started separately and outside
the kernel. Also the main difference is that klibc is compiled. Here we
only provide includes so that there's nothing to compile before using it.
We'll see when this becomes an issue, but for now it stands fine.

But I agree that both pursue very similar goals.

Willy


* Re: Metalanguage for the Linux UAPI
  2025-05-16  4:22         ` Willy Tarreau
@ 2025-05-16  4:35           ` H. Peter Anvin
  0 siblings, 0 replies; 11+ messages in thread
From: H. Peter Anvin @ 2025-05-16  4:35 UTC
  To: Willy Tarreau
  Cc: enh, Arnd Bergmann, LKML, Linus Torvalds, libc-alpha, linux-arch

On May 15, 2025 9:22:46 PM PDT, Willy Tarreau <w@1wt.eu> wrote:
>On Thu, May 15, 2025 at 09:17:14PM -0700, H. Peter Anvin wrote:
>> Ah yes, nolibc; basically klibc reinvented...
>> <ducks and runs>
>
>:-)
>
>That was not the initial intent though as it started separately and outside
>the kernel. Also the main difference is that klibc is compiled. Here we
>only provide includes so that there's nothing to compile before using it.
>We'll see when this becomes an issue, but for now it stands fine.
>
>But I agree that both pursue very similar goals.
>
>Willy

Certainly. I'm being snarky, but I'm not upset :)


* Re: Metalanguage for the Linux UAPI
  2025-05-15 21:24   ` H. Peter Anvin
  2025-05-16  3:42     ` Willy Tarreau
@ 2025-05-16 17:27     ` enh
  1 sibling, 0 replies; 11+ messages in thread
From: enh @ 2025-05-16 17:27 UTC
  To: H. Peter Anvin
  Cc: Arnd Bergmann, LKML, Linus Torvalds, libc-alpha, linux-arch

On Thu, May 15, 2025 at 5:24 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> On 5/15/25 13:26, enh wrote:
> > On Thu, May 15, 2025 at 4:05 PM H. Peter Anvin <hpa@zytor.com> wrote:
> >>
> >> OK, so this is something I have been thinking about for quite a while.
> >> It would be a quite large project, so I would like to hear people's
> >> opinions on it before even starting.
> >>
> >> We have finally succeeded in divorcing the Linux UAPI from the general
> >> kernel headers, but even so, there are a lot of things in the UAPI that
> >> means it is not possible for an arbitrary libc to use it directly; for
> >> example "struct termios" is not the glibc "struct termios", but
> >> redefining it breaks the ioctl numbering unless the ioctl headers are
> >> changed as well, and so on. However, other libcs want to use the struct
> >> termios as defined in the kernel, or, more likely, struct termios2.
> >
> > bionic is a ("the only"?) libc that tries to not duplicate _anything_
> > and always defer to the uapi headers. we have quite an extensive list
> > of hacks we need to apply to rewrite the uapi headers into something
> > directly usable (and a lot of awful python to apply those hacks):
> >
> > https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/kernel/tools/defaults.py
> >
>
> Not "the only".
>
> > a lot are just name collisions ("you say 'class', my c++ compiler says
> > wtf?!"), but there are a few "posix and linux disagree"s too. (other
> > libcs that weren't linux-only from day one might have more conflicts,
> > such as a comically large sigset_t, say :-) )
> >
> > but i think most if not all of that could be fixed upstream, given the will?
> >
> > (though some c programmers do still get upset if told they shouldn't
> > use c++ keywords as identifiers, i note that the uapi headers _were_
> > recently fixed to avoid a c extension that's invalid c++. thanks,
> > anyone involved in that who's reading this!)
> >
> >> Furthermore, I was looking further into how C++ templates could be used
> >> to make user pointers inherently safe and probably more efficient, but
> >> ran into the problem that you really want to be able to convert a
> >> user-tagged structure to a structure with "safe-user-tagged" members
> >> (after access_ok), which turned out not to be trivially supportable even
> >> after the latest C++ modernizations (without which I don't consider C++
> >> viable at all; I would not consider versions of C++ before C++17 worthy
> >> of even looking at; C++20 preferred.)
> >
> > (/me assumes you're just trolling linus with this.)
>
> I'm not; I posted a long article about why I think it might be an
> alternative worth pursuing. I know, of course, Linus' long time hatred
> of C++, but as I said: I think *very recent* versions of C++ have a lot
> to offer, mainly in the form of metaprogramming (which we currently do
> using some amazingly ugly macros.)
>
> https://lore.kernel.org/lkml/3465e0c6-f5b2-4c42-95eb-29361481f805@zytor.com
>
> >> And it is not just generation of in-kernel versus out-of-kernel headers
> >> that is an issue (which we have managed to deal with pretty well.) There
> >> generally isn't enough information in C headers alone to do well at
> >> creating bindings for other languages, *especially* given how many
> >> constants are defined in terms of macros.
> >
> > (yeah, while i think the _c_ [and c++] problems could be solved much
> > more easily, solving the swift/rust/golang duplication of all that
> > stuff is a whole other thing. i'd try to sign up one of those
> > languages' library's maintainers before investing too much in having
> > another representation of the uapi though...)
>
> Yes, that's one of the reasons for posting this.
>
> >> The use of C also makes it hard to mangle the headers for user space.
> >> For example, glibc has to add __extension__ before anonymous struct or
> >> union members in order to be able to compile in strict C90 mode.
> >
> > (again, that one seems easily fixable upstream.)
>
> Agreed... until it breaks again. And how much

that's just an argument for more/better CI though. android's kernel
folks do do abi checking on the uapi headers. there's no theoretical
reason we couldn't do source compatibility checking too, other than
"funding, lack of".

> >> I have been considering if it would make sense to create more of a
> >> metalanguage for the Linux UAPI. This would be run through a more
> >> advanced preprocessor than cpp written in C and yacc/bison. (It could
> >> also be done via a gcc plugin or a DWARF parser, but I do not like tying
> >> this to compiler internals, and DWARF parsing is probably more complex
> >> and less versatile.)
> >>
> >> It could thus provide things like "true" constants (constexpr for C++11
> >> or C23, or enums), bitfield macro explosions and so on, depending on
> >> what the backend user would like: namespacing, distributed enumerations,
> >> and assembly offset constants, and even possibly syscall stubs.
> >
> > (given a clean slate that wouldn't be terrible, but you get a lot of
> > #if nonsense. though the `#define foo foo` trick lets you have the
> > best of both worlds [at some cost to compile time].)
>
> Again, that would be a choice for the data consumer (backend), which is
> one of the main advantages here.
>
>         -hpa
>
