* Weird failure with bpftool when building 6.11-rc4 with clang+rust+lto
From: Neal Gompa @ 2024-08-20 12:02 UTC
To: rust-for-linux
Cc: Miguel Ojeda, Quentin Monnet, bpf, Justin M. Forbes, Davide Cavalca, Janne Grunau, Hector Martin, Asahi Linux

Hey all,

While working on enabling Rust in the Fedora kernel[1], we've managed
to get the setup almost completely working, but we have a build
failure with the clang+lto build variant[2][3].

Based on the build failure log[4][5], it looks like there's some
random mixing of Rust inside of C code or something of the sort (which
obviously would be invalid).

Can someone help with this?

Thanks in advance and best regards,

[1]: https://gitlab.com/cki-project/kernel-ark/-/merge_requests/3295
[2]: https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/index.html?prefix=trusted-artifacts/1419488480/build_x86_64/7618803903/artifacts/
[3]: https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/index.html?prefix=trusted-artifacts/1419488480/build_aarch64/7618803917/artifacts/
[4]: https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/1419488480/build_x86_64/7618803903/artifacts/build-failure.log
[5]: https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/1419488480/build_aarch64/7618803917/artifacts/build-failure.log

--
Neal Gompa (FAS: ngompa)

* Re: Weird failure with bpftool when building 6.11-rc4 with clang+rust+lto
From: Matthew Maurer @ 2024-08-20 16:48 UTC
To: Neal Gompa
Cc: rust-for-linux, Miguel Ojeda, Quentin Monnet, bpf, Justin M. Forbes, Davide Cavalca, Janne Grunau, Hector Martin, Asahi Linux

Sorry that this isn't a solution, but I can tell you some background:

Linux currently relies on the `--lang_exclude` flag to `pahole` to
filter Rust debugging information out of the output BTF. This is done
because various downstream tools (for example, bpftool) do not handle
Rust types correctly. The `--lang_exclude` flag works by checking the
language flag on the compilation unit the type is referenced in. Once
LTO is enabled, however, things can migrate from one compilation unit
to another, leading to C code having Rust code and types referenced
inside it. The resulting type will be considered a C-language type by
`pahole`, but is actually a Rust type. Even if you fixed bpftool,
without additional patches/hacks, once the kernel boots it would
likely fail to parse its own BTF debugging information, and disable
BPF loading.

The most confusing part to me here is that I only encountered this
issue with x-lang LTO enabled, which is not available in the kernel
you're building from. If this is happening without x-lang LTO enabled,
it likely means that there's another way for debug symbols to leak
across CUs during LTO. That's where I'd start looking - use `pahole`
to dump the contents of `vmlinux.o` and see if you can find a
C-language CU referencing a Rust type. Then, try to figure out how
that's possible. With x-lang LTO it was obvious, inlining caused a
bunch of issues.

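The search suggested above (a C-language CU referencing a Rust type) can be roughly sketched in Python. This is not from the thread: it swaps `pahole` for the text output of `llvm-dwarfdump --debug-info`, on the assumption that the dump prints each DIE's absolute offset at the start of the line and prints `DW_AT_type` references as absolute offsets. The function and type names are hypothetical, and only `DW_AT_type` is followed, so references that leak through other attributes would be missed.

```python
import bisect
import re
from typing import NamedTuple

# Assumed llvm-dwarfdump text layout: "0x0000002a:   DW_TAG_variable"
DIE_RE = re.compile(r"^(0x[0-9a-fA-F]+):\s+(DW_TAG_\w+)")
LANG_RE = re.compile(r"DW_AT_language\s+\((DW_LANG_\w+)\)")
TYPE_RE = re.compile(r"DW_AT_type\s+\((0x[0-9a-fA-F]+)")


class CrossRef(NamedTuple):
    die_offset: int      # DIE that holds the DW_AT_type reference
    source_lang: str     # language of the CU containing that DIE
    target_offset: int   # referenced type DIE, located in a Rust CU


def find_rust_refs_from_other_cus(dump: str) -> list[CrossRef]:
    """Report DW_AT_type references from non-Rust CUs into Rust CUs."""
    cu_offsets: list[int] = []   # offset of each DW_TAG_compile_unit DIE
    cu_langs: list[str] = []     # DW_AT_language of the matching CU
    refs: list[tuple[int, int, int]] = []  # (cu index, DIE, target)
    current_die = 0

    for line in dump.splitlines():
        m = DIE_RE.match(line)
        if m:
            current_die = int(m.group(1), 16)
            if m.group(2) == "DW_TAG_compile_unit":
                cu_offsets.append(current_die)
                cu_langs.append("unknown")
            continue
        if not cu_offsets:
            continue
        m = LANG_RE.search(line)
        if m and current_die == cu_offsets[-1]:
            cu_langs[-1] = m.group(1)
            continue
        m = TYPE_RE.search(line)
        if m:
            refs.append((len(cu_offsets) - 1, current_die, int(m.group(1), 16)))

    hits = []
    for cu_index, die_offset, target in refs:
        # The CU owning an offset is the last CU DIE at or before it.
        target_cu = bisect.bisect_right(cu_offsets, target) - 1
        if target_cu < 0 or target_cu == cu_index:
            continue
        if (cu_langs[target_cu] == "DW_LANG_Rust"
                and cu_langs[cu_index] != "DW_LANG_Rust"):
            hits.append(CrossRef(die_offset, cu_langs[cu_index], target))
    return hits
```

Feeding it the saved output of `llvm-dwarfdump --debug-info vmlinux.o` would list candidate DIEs to inspect by hand; an empty result does not prove the absence of leaks, since the parsing is heuristic.
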
The last possibility I can think of is that somehow in your build
configuration `pahole` is not being invoked with the `--lang_exclude`
flag when building `vmlinux`. I don't know why that would be, but it
might be worth double checking.

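One way to double check the flag, sketched here under assumptions that are not from the thread: the build was run verbosely enough that the `pahole` command lines appear in a saved log, one command per line. The function name and the sample file name in the usage note are hypothetical.

```python
import shlex


def pahole_calls_missing_lang_exclude(log_text: str) -> list[str]:
    """Return logged pahole command lines that lack --lang_exclude."""
    missing = []
    for line in log_text.splitlines():
        try:
            words = shlex.split(line)
        except ValueError:
            # Unbalanced quotes in a log line: fall back to a plain split.
            words = line.split()
        # Match "pahole" or a path ending in "/pahole" as a separate word.
        if not any(w == "pahole" or w.endswith("/pahole") for w in words):
            continue
        # Version probes are not BTF-encoding runs; ignore them.
        if "--version" in words:
            continue
        if not any(w.startswith("--lang_exclude") for w in words):
            missing.append(line.strip())
    return missing
```

Calling it on the text of a captured build log (for example, the contents of a hypothetical `build.log`) returns the suspicious invocations; an empty list only means that every logged `pahole` call carried the flag, not that the flag had the intended effect.
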
* Re: Weird failure with bpftool when building 6.11-rc4 with clang+rust+lto
From: Neal Gompa @ 2024-08-23 21:45 UTC
To: Matthew Maurer
Cc: rust-for-linux, Miguel Ojeda, Quentin Monnet, bpf, Justin M. Forbes, Davide Cavalca, Janne Grunau, Hector Martin, Asahi Linux

Hey Matthew,

The current thinking is that maybe the culprit is dwarves. We've
backported a fix in Fedora that may help, I'm waiting to find out if
it does. Apparently all test runs with Clang+LTO are broken right now
with dwarves 1.27, so it wasn't just unique to my merge request.

--
Neal Gompa (FAS: ngompa)

* Re: Weird failure with bpftool when building 6.11-rc4 with clang+rust+lto
From: Neal Gompa @ 2024-08-31 11:25 UTC
To: Matthew Maurer
Cc: rust-for-linux, Miguel Ojeda, Quentin Monnet, bpf, Justin M. Forbes, Davide Cavalca, Janne Grunau, Hector Martin, Asahi Linux

On Fri, Aug 23, 2024 at 5:45 PM Neal Gompa <ngompa@fedoraproject.org> wrote:
>
> Hey Matthew,
>
> The current thinking is that maybe the culprit is dwarves. We've
> backported a fix in Fedora that may help, I'm waiting to find out if
> it does. Apparently all test runs with Clang+LTO are broken right now
> with dwarves 1.27, so it wasn't just unique to my merge request.
>

Well, that didn't work. The backported fix doesn't seem to have
resolved the problem. I'm at a loss now.

--
Neal Gompa (FAS: ngompa)