From: Ralf Jung <post@ralfj.de>
To: Ventura Jack <venturajack85@gmail.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>,
Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>,
Gary Guo <gary@garyguo.net>,
torvalds@linux-foundation.org, airlied@gmail.com,
boqun.feng@gmail.com, david.laight.linux@gmail.com, ej@inai.de,
gregkh@linuxfoundation.org, hch@infradead.org, hpa@zytor.com,
ksummit@lists.linux.dev, linux-kernel@vger.kernel.org,
rust-for-linux@vger.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)
Date: Thu, 27 Feb 2025 18:58:22 +0100 [thread overview]
Message-ID: <d29ebda1-e6ca-455d-af07-ac1daf84a3d2@ralfj.de> (raw)
In-Reply-To: <CAFJgqgS-S3ZbPfYsA-eJmCXHhMrzwaKW1-G+LegKJNqqGm31UQ@mail.gmail.com>
Hi VJ,
>> I am not aware of a comparable tool that would be in wide-spread use, or that is
>> carefully aligned with the semantics of an actual compiler.
>> For C, there is Cerberus (https://www.cl.cam.ac.uk/~pes20/cerberus/) as an
>> executable version of the C specification, but it can only run tiny examples.
>> The verified CompCert compiler comes with a semantics one could interpret, but
>> that only checks code for compatibility with CompCert C, which has a lot less
>> (and a bit more) UB than real C.
>> There are also two efforts that turned into commercial tools that I have not
>> tried, and for which there is hardly any documentation of how they interpret the
>> C standard so it's not clear what a green light from them means when compiling
>> with gcc or clang. I also don't know how much real-world code they can actually run.
>> - TrustInSoft/tis-interpreter, mostly gone from the web but still available in
>> the wayback machine
>> (https://web.archive.org/web/20200804061411/https://github.com/TrustInSoft/tis-interpreter/);
>> I assume this got integrated into their "TrustInSoft Analyzer" product.
>> - kcc, a K-framework based formalization of C that is executable. The public
>> repo is dead (https://github.com/kframework/c-semantics) and when I tried to
>> build their tool that didn't work. The people behind this have a company that
>> offers "RV-Match" as a commercial product claiming to find bugs in C based on "a
>> complete formal ISO C11 semantics" so I guess that is where their efforts go now.
>>
>> For C++ and Zig, I am not aware of anything comparable.
>>
>> Part of the problem is that in C, 2 people will have 3 ideas for what the
>> standard means. Compiler writers and programmers regularly have wildly
>> conflicting ideas of what is and is not allowed. There are many different places
>> in the standard that have to be scanned to answer "is this well-defined" even
>> for very simple programs. (https://godbolt.org/z/rjaWc6EzG is one of my favorite
>> examples.) A tool can check a single well-defined semantics, but who gets to
>> decide what exactly those semantics are?
>> Formalizing the C standard requires extensive interpretation, so I am skeptical
>> of everyone who claims that they "formalized the C standard" and built a tool on
>> that without extensive evaluation of how their formalization compares to what
>> compilers do and what programmers rely on. The Cerberus people have done that
>> evaluation (see e.g. https://dl.acm.org/doi/10.1145/2980983.2908081), but none
>> of the other efforts have (to my knowledge). Ideally such a formalization effort
>> would be done in close collaboration with compiler authors and the committee so
>> that the ambiguities in the standard can be resolved and the formalization
>> becomes the one canonical interpretation. The Cerberus people are the ones that
>> pushed the C provenance formalization through, so they made great progress here.
>> However, many issues remain, some quite long-standing (e.g.
>> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm and
>> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_451.htm, which in my eyes
>> never got properly resolved by clarifying the standard). Martin and a few others
>> are slowly pushing things in the right direction, but it takes a long time.
>> Rust, by having a single project in charge of the one canonical implementation
>> and the specification, and having an open process that is well-suited for
>> incorporating user concerns, can move a lot quicker here. C has a huge
>> head-start, Rust has nothing like the C standard, but we are catching up -- and
>> our goal is more ambitious than that; we are doing our best to learn from C and
>> C++ and concluded that that style of specification is too prone to ambiguity, so
>> we are trying to achieve a formally precise unambiguous specification. Wasm
>> shows that this can be done, at industry scale, albeit for a small language --
>> time we do it for a large one. :)
>>
>> So, yes I think Miri is fairly unique. But please let me know if I missed something!
>>
>> (As an aside, the above hopefully also explains why some people in Rust are
>> concerned about alternative implementations. We do *not* want the current
>> de-factor behavior to ossify and become the specification. We do *not* want the
>> specification to just be a description of what the existing implementations at
>> the time happen to do, and declare all behavior differences to be UB or
>> unspecified or so just because no implementation is willing to adjust their
>> behavior to match the rest. We want the specification to be prescriptive, not
>> descriptive, and we will adjust the implementation as we see fit to achieve that
>> -- within the scope of Rust's stability guarantees. That's also why we are so
>> cagey about spelling out the aliasing rules until we are sure we have a good
>> enough model.)
>
> Very interesting, thank you for the exhaustive answer.
>
> Might it be accurate to categorize Miri as a
> "formal-semantics-based undefined-behavior-detecting interpreter"?
Sure, why not. :)
>
>> https://godbolt.org/z/rjaWc6EzG
>
> That example uses a compiler-specific attribute AFAIK, namely
>
> __attribute__((noinline))
>
> When using compiler-specific attributes and options, the
> original language is arguably no longer being used, depending
> on the attribute. Though a language being inexpressive and
> possibly requiring compiler extensions to achieve some goals,
> possibly like in this C example, can be a disadvantage in itself.
That attribute just exists to make the example small and fit in a single file.
If you user multiple translation units, you can achieve the same effect without
the attribute. Anyway compilers promise (I hope^^) that that particular
attribute has no bearing on whether the code has UB. So, the question of whether
the program without the attribute has UB is still a very interesting one.
At least clang treats this code as having UB, and one can construct a similar
example for gcc. IMO this is not backed by the standard itself, though it can be
considered backed by some defect reports -- but those were for earlier versions
of the standard so technically, they do not apply to C23.
>> [On formalization]
>
> I agree that Rust has some advantages in regards to formalization,
> but some of them that I think of, are different from what you
> mention here. And I also see some disadvantages.
>
> C is an ancient language, and parsing and handling C is made
> more complex by the preprocessor. Rust is a much younger
> language that avoided all that pain, and is easier to parse
> and handle. C++ is way worse, though might become closer
> to Rust with C++ modules.
>
> Rust is more willing to break existing code in projects, causing
> previously compiling projects to no longer compile. rustc does this
> rarely, but it has happened, also long after Rust 1.0.
>
> From last year, 2024.
>
> https://internals.rust-lang.org/t/type-inference-breakage-in-1-80-has-not-been-handled-well/21374
> "Rust 1.80 broke builds of almost all versions of the
> very popular time crate (edit: please don't shoot the
> messenger in that GitHub thread!!!)
>
> Rust has left only a 4-month old version working.
> That was not enough time for the entire Rust
> ecosystem to update, especially that older
> versions of time were not yanked, and users
> had no advance warning that it will stop working.
>
> A crater run found a regression in over 5000 crates,
> and that has just been accepted as okay without
> any further action! This is below the level of stability
> and reliability that Rust should have."
>
> If C was willing to break code as much as Rust, it would be easier to
> clean up C.
Is that true? Gcc updates do break code.
>> [Omitted] We do *not* want the
>> specification to just be a description of what the existing implementations at
>> the time happen to do, and declare all behavior differences to be UB or
>> unspecified or so just because no implementation is willing to adjust their
>> behavior to match the rest. [Omitted]
>
> I have seen some Rust proponents literally say that there is
> a specification for Rust, and that it is called rustc/LLVM.
> Though those specific individuals may not have been the
> most credible individuals.
Maybe don't take the word of random Rust proponents on the internet as anything
more than that. :) I can't speak for the entire Rust project, but I can speak
as lead of the operational semantics team of the Rust project -- no, we do not
consider rustc/LLVM to be a satisfying spec. Producing a proper spec is on the
project agenda.
> A fear I have is that there may be hidden reliance in
> multiple different ways on LLVM, as well as on rustc.
> Maybe even very deeply so. The complexity of Rust's
> type system and rustc's type system checking makes
> me more worried about this point. If there are hidden
> elements, they may turn out to be very difficult to fix,
> especially if they are discovered to be fundamental.
> While having one compiler can be an advantage in
> some ways, it can arguably be a disadvantage
> in some other ways, as you acknowledge as well
> if I understand you correctly.
The Rust type system has absolutely nothing to do with LLVM. Those are
completely separate parts of the compiler. So I don't see any way that LLVM
could possibly influence our type system.
We already discussed previously that indeed, the Rust operational semantics has
a risk of overfitting to LLVM. I acknowledge that.
> You mention ossifying, but the more popular Rust becomes,
> the more painful breakage will be, and the less suited
> Rust will be as a research language.
I do not consider Rust a research language. :)
> Does Crater run Rust for Linux and relevant Rust
> kernel code?
Even better: every single change that lands in Rust checks Rust-for-Linux as
part of our CI.
> I hope that any new language at least has its
> language developers ensure that they have a type
> system that is formalized and proven correct
> before that langauge's 1.0 release.
> Since fixing a type system later can be difficult or
> practically impossible. A complex type system
> and complex type checking can be a larger risk in this
> regard relative to a simple type system and simple
> type checking, especially the more time passes and
> the more the language is used and have code
> written in it, making it more difficult to fix the language
> due to code breakage costing more.
Uff, that's a very high bar to pass.^^ I think there's maybe two languages ever
that meet this bar? SML and wasm.
>>> There are some issues in Rust that I am curious as to
>>> your views on. rustc or the Rust language has some type
>>> system holes, which still causes problems for rustc and
>>> their developers.
>>>
>>> https://github.com/lcnr/solver-woes/issues/1
>>> https://github.com/rust-lang/rust/issues/75992
>>>
>>> Those kinds of issues seem difficult to solve.
>>>
>>> In your opinion, is it accurate to say that the Rust language
>>> developers are working on a new type system for
>>> Rust-the-language and a new solver for rustc, and that
>>> they are trying to make the new type system and new solver
>>> as backwards compatible as possible?
>>
>> It's not really a new type system. It's a new implementation for the same type
>> system. But yes there is work on a new "solver" (that I am not involved in) that
>> should finally fix some of the long-standing type system bugs. Specifically,
>> this is a "trait solver", i.e. it is the component responsible for dealing with
>> trait constraints. Due to some unfortunate corner-case behaviors of the old,
>> organically grown solver, it's very hard to do this in a backwards-compatible
>> way, but we have infrastructure for extensive ecosystem-wide testing to judge
>> the consequences of any given potential breaking change and ensure that almost
>> all existing code keeps working. In fact, Rust 1.84 already started using the
>> new solver for some things
>> (https://blog.rust-lang.org/2025/01/09/Rust-1.84.0.html) -- did you notice?
>> Hopefully not. :)
>
> If it is not a new type system, why then do they talk about
> backwards compatibility for existing Rust projects?
If you make a tiny change to a type system, is it a "new type system"? "new type
system" sounds like "from-scratch redesign". That's not what happens.
> If the type system is not changed, existing projects would
> still type check. And in this repository of one of the main
> Rust language developers as I understand it, several
> issues are labeled with "S-fear".
>
> https://github.com/lcnr/solver-woes/issues
>
> They have also been working on this new solver for
> several years. Reading through the issues, a lot of
> the problems seem very hard.
It is hard, indeed. But last I knew, the types team is confident that they can
pull it off, and I have confidence in them.
Kind regards,
Ralf
next prev parent reply other threads:[~2025-02-27 17:58 UTC|newest]
Thread overview: 196+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-22 10:06 C aggregate passing (Rust kernel policy) Ventura Jack
2025-02-22 14:15 ` Gary Guo
2025-02-22 15:03 ` Ventura Jack
2025-02-22 18:54 ` Kent Overstreet
2025-02-22 19:18 ` Linus Torvalds
2025-02-22 20:00 ` Kent Overstreet
2025-02-22 20:54 ` H. Peter Anvin
2025-02-22 21:22 ` Kent Overstreet
2025-02-22 21:46 ` Linus Torvalds
2025-02-22 22:34 ` Kent Overstreet
2025-02-22 23:56 ` Jan Engelhardt
2025-02-22 22:12 ` David Laight
2025-02-22 22:46 ` Kent Overstreet
2025-02-22 23:50 ` H. Peter Anvin
2025-02-23 0:06 ` Kent Overstreet
2025-02-22 21:22 ` Linus Torvalds
2025-02-23 15:30 ` Ventura Jack
2025-02-23 16:28 ` David Laight
2025-02-24 0:27 ` Gary Guo
2025-02-24 9:57 ` Ventura Jack
2025-02-24 10:31 ` Benno Lossin
2025-02-24 12:21 ` Ventura Jack
2025-02-24 12:47 ` Benno Lossin
2025-02-24 16:57 ` Ventura Jack
2025-02-24 22:03 ` Benno Lossin
2025-02-24 23:04 ` Ventura Jack
2025-02-25 22:38 ` Benno Lossin
2025-02-25 22:47 ` Miguel Ojeda
2025-02-25 23:03 ` Benno Lossin
2025-02-24 12:58 ` Theodore Ts'o
2025-02-24 14:47 ` Miguel Ojeda
2025-02-24 14:54 ` Miguel Ojeda
2025-02-24 16:42 ` Philip Herron
2025-02-25 15:55 ` Ventura Jack
2025-02-25 17:30 ` Arthur Cohen
2025-02-26 11:38 ` Ralf Jung
2025-02-24 15:43 ` Miguel Ojeda
2025-02-24 17:24 ` Kent Overstreet
2025-02-25 16:12 ` Alice Ryhl
2025-02-25 17:21 ` Ventura Jack
2025-02-25 17:36 ` Alice Ryhl
2025-02-25 18:16 ` H. Peter Anvin
2025-02-25 20:21 ` Kent Overstreet
2025-02-25 20:37 ` H. Peter Anvin
2025-02-26 13:03 ` Ventura Jack
2025-02-26 13:53 ` Miguel Ojeda
2025-02-26 14:07 ` Ralf Jung
2025-02-26 14:26 ` James Bottomley
2025-02-26 14:37 ` Ralf Jung
2025-02-26 14:39 ` Greg KH
2025-02-26 14:45 ` James Bottomley
2025-02-26 16:00 ` Steven Rostedt
2025-02-26 16:42 ` James Bottomley
2025-02-26 16:47 ` Kent Overstreet
2025-02-26 16:57 ` Steven Rostedt
2025-02-26 17:41 ` Kent Overstreet
2025-02-26 17:47 ` Steven Rostedt
2025-02-26 22:07 ` Josh Poimboeuf
2025-03-02 12:19 ` David Laight
2025-02-26 17:11 ` Miguel Ojeda
2025-02-26 17:42 ` Kent Overstreet
2025-02-26 12:36 ` Ventura Jack
2025-02-26 13:52 ` Miguel Ojeda
2025-02-26 15:21 ` Ventura Jack
2025-02-26 16:06 ` Ralf Jung
2025-02-26 17:49 ` Miguel Ojeda
2025-02-26 18:36 ` Ventura Jack
2025-02-26 14:14 ` Ralf Jung
2025-02-26 15:40 ` Ventura Jack
2025-02-26 16:10 ` Ralf Jung
2025-02-26 16:50 ` Ventura Jack
2025-02-26 21:39 ` Ralf Jung
2025-02-27 15:11 ` Ventura Jack
2025-02-27 15:32 ` Ralf Jung
2025-02-25 18:54 ` Linus Torvalds
2025-02-25 19:47 ` Kent Overstreet
2025-02-25 20:25 ` Linus Torvalds
2025-02-25 20:55 ` Kent Overstreet
2025-02-25 21:24 ` Linus Torvalds
2025-02-25 23:34 ` Kent Overstreet
2025-02-26 11:57 ` Gary Guo
2025-02-27 14:43 ` Ventura Jack
2025-02-26 14:26 ` Ventura Jack
2025-02-25 22:45 ` Miguel Ojeda
2025-02-26 0:05 ` Miguel Ojeda
2025-02-25 22:42 ` Miguel Ojeda
2025-02-26 14:01 ` Ralf Jung
2025-02-26 13:54 ` Ralf Jung
2025-02-26 17:59 ` Linus Torvalds
2025-02-26 19:01 ` Paul E. McKenney
2025-02-26 20:00 ` Martin Uecker
2025-02-26 21:14 ` Linus Torvalds
2025-02-26 21:21 ` Linus Torvalds
2025-02-26 22:54 ` David Laight
2025-02-27 0:35 ` Paul E. McKenney
2025-02-26 21:26 ` Steven Rostedt
2025-02-26 21:37 ` Steven Rostedt
2025-02-26 21:42 ` Linus Torvalds
2025-02-26 21:56 ` Steven Rostedt
2025-02-26 22:13 ` Steven Rostedt
2025-02-26 22:22 ` Linus Torvalds
2025-02-26 22:35 ` Steven Rostedt
2025-02-26 23:18 ` Linus Torvalds
2025-02-26 23:28 ` Steven Rostedt
2025-02-27 0:04 ` Linus Torvalds
2025-02-27 20:47 ` David Laight
2025-02-27 21:33 ` Steven Rostedt
2025-02-28 21:29 ` Paul E. McKenney
2025-02-27 21:41 ` Paul E. McKenney
2025-02-27 22:20 ` David Laight
2025-02-27 22:40 ` Paul E. McKenney
2025-02-28 7:44 ` Ralf Jung
2025-02-28 15:41 ` Kent Overstreet
2025-02-28 15:46 ` Boqun Feng
2025-02-28 16:04 ` Kent Overstreet
2025-02-28 16:13 ` Boqun Feng
2025-02-28 16:21 ` Kent Overstreet
2025-02-28 16:40 ` Boqun Feng
2025-03-04 18:12 ` Ralf Jung
2025-02-26 22:27 ` Kent Overstreet
2025-02-26 23:16 ` Linus Torvalds
2025-02-27 0:17 ` Kent Overstreet
2025-02-27 0:26 ` comex
2025-02-27 18:33 ` Ralf Jung
2025-02-27 19:15 ` Linus Torvalds
2025-02-27 19:55 ` Kent Overstreet
2025-02-27 20:28 ` Linus Torvalds
2025-02-28 7:53 ` Ralf Jung
2025-03-06 19:16 ` Ventura Jack
2025-02-27 4:18 ` Martin Uecker
2025-02-27 5:52 ` Linus Torvalds
2025-02-27 6:56 ` Martin Uecker
2025-02-27 14:29 ` Steven Rostedt
2025-02-27 17:35 ` Paul E. McKenney
2025-02-27 18:13 ` Kent Overstreet
2025-02-27 19:10 ` Paul E. McKenney
2025-02-27 18:00 ` Ventura Jack
2025-02-27 18:44 ` Ralf Jung
2025-02-27 14:21 ` Ventura Jack
2025-02-27 15:27 ` H. Peter Anvin
2025-02-28 8:08 ` Ralf Jung
2025-02-28 8:32 ` Martin Uecker
2025-02-26 20:25 ` Kent Overstreet
2025-02-26 20:34 ` Andy Lutomirski
2025-02-26 22:45 ` David Laight
2025-02-22 19:41 ` Miguel Ojeda
2025-02-22 20:49 ` Kent Overstreet
2025-02-26 11:34 ` Ralf Jung
2025-02-26 14:57 ` Ventura Jack
2025-02-26 16:32 ` Ralf Jung
2025-02-26 18:09 ` Ventura Jack
2025-02-26 22:28 ` Ralf Jung
2025-02-26 23:08 ` David Laight
2025-02-27 13:55 ` Ralf Jung
2025-02-27 17:33 ` Ventura Jack
2025-02-27 17:58 ` Ralf Jung [this message]
2025-02-27 19:06 ` Ventura Jack
2025-02-27 19:45 ` Ralf Jung
2025-02-27 20:22 ` Kent Overstreet
2025-02-27 22:18 ` David Laight
2025-02-27 23:18 ` Kent Overstreet
2025-02-28 7:38 ` Ralf Jung
2025-02-28 20:48 ` Ventura Jack
2025-02-28 20:41 ` Ventura Jack
2025-02-28 22:13 ` Geoffrey Thomas
2025-03-01 14:19 ` Ventura Jack
2025-03-04 18:24 ` Ralf Jung
2025-03-06 18:49 ` Ventura Jack
2025-02-27 17:58 ` Miguel Ojeda
2025-02-27 19:25 ` Ventura Jack
2025-02-26 19:07 ` Martin Uecker
2025-02-26 19:23 ` Ralf Jung
2025-02-26 20:22 ` Martin Uecker
[not found] <CAFJgqgRZ1w0ONj2wbcczx2=boXYHoLOd=-ke7tHGBAcifSfPUw@mail.gmail.com>
2025-02-25 15:42 ` H. Peter Anvin
2025-02-25 16:45 ` Ventura Jack
[not found] <CANiq72m-R0tOakf=j7BZ78jDHdy=9-fvZbAT8j91Je2Bxy0sFg@mail.gmail.com>
2025-02-18 16:08 ` Rust kernel policy Christoph Hellwig
2025-02-18 18:46 ` Miguel Ojeda
2025-02-18 21:49 ` H. Peter Anvin
2025-02-18 22:54 ` Miguel Ojeda
2025-02-19 0:58 ` H. Peter Anvin
2025-02-19 3:04 ` Boqun Feng
2025-02-19 5:39 ` Greg KH
2025-02-20 12:28 ` Jan Engelhardt
2025-02-20 12:37 ` Greg KH
2025-02-20 13:23 ` H. Peter Anvin
2025-02-20 15:17 ` C aggregate passing (Rust kernel policy) Jan Engelhardt
2025-02-20 16:46 ` Linus Torvalds
2025-02-20 20:34 ` H. Peter Anvin
2025-02-21 8:31 ` HUANG Zhaobin
2025-02-21 18:34 ` David Laight
2025-02-21 19:12 ` Linus Torvalds
2025-02-21 20:07 ` comex
2025-02-21 21:45 ` David Laight
2025-02-22 6:32 ` Willy Tarreau
2025-02-22 6:37 ` Willy Tarreau
2025-02-22 8:41 ` David Laight
2025-02-22 9:11 ` Willy Tarreau
2025-02-21 20:06 ` Jan Engelhardt
2025-02-21 20:23 ` Laurent Pinchart
2025-02-21 20:24 ` Laurent Pinchart
2025-02-21 22:02 ` David Laight
2025-02-21 22:13 ` Bart Van Assche
2025-02-22 5:56 ` comex
2025-02-21 20:26 ` Linus Torvalds
2025-02-21 22:19 ` henrychurchill
2025-02-21 22:52 ` henrychurchill
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d29ebda1-e6ca-455d-af07-ac1daf84a3d2@ralfj.de \
--to=post@ralfj.de \
--cc=airlied@gmail.com \
--cc=boqun.feng@gmail.com \
--cc=david.laight.linux@gmail.com \
--cc=ej@inai.de \
--cc=gary@garyguo.net \
--cc=gregkh@linuxfoundation.org \
--cc=hch@infradead.org \
--cc=hpa@zytor.com \
--cc=kent.overstreet@linux.dev \
--cc=ksummit@lists.linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=miguel.ojeda.sandonis@gmail.com \
--cc=rust-for-linux@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=venturajack85@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox