rust-for-linux.vger.kernel.org archive mirror
* Weird failure with bpftool when building 6.11-rc4 with clang+rust+lto
From: Neal Gompa @ 2024-08-20 12:02 UTC
  To: rust-for-linux
  Cc: Miguel Ojeda, Quentin Monnet, bpf, Justin M. Forbes,
	Davide Cavalca, Janne Grunau, Hector Martin, Asahi Linux

Hey all,

While working on enabling Rust in the Fedora kernel[1], we've managed
to get the setup almost completely working, but we have a build
failure with the clang+lto build variant[2][3].

Based on the build failure log[4][5], it looks like Rust is somehow
getting mixed into C code, or something of the sort (which would
obviously be invalid).

Can someone help with this?

Thanks in advance and best regards,

[1]: https://gitlab.com/cki-project/kernel-ark/-/merge_requests/3295
[2]: https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/index.html?prefix=trusted-artifacts/1419488480/build_x86_64/7618803903/artifacts/
[3]: https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/index.html?prefix=trusted-artifacts/1419488480/build_aarch64/7618803917/artifacts/
[4]: https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/1419488480/build_x86_64/7618803903/artifacts/build-failure.log
[5]: https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/1419488480/build_aarch64/7618803917/artifacts/build-failure.log


-- 
Neal Gompa (FAS: ngompa)


* Re: Weird failure with bpftool when building 6.11-rc4 with clang+rust+lto
From: Matthew Maurer @ 2024-08-20 16:48 UTC
  To: Neal Gompa
  Cc: rust-for-linux, Miguel Ojeda, Quentin Monnet, bpf,
	Justin M. Forbes, Davide Cavalca, Janne Grunau, Hector Martin,
	Asahi Linux

Sorry that this isn't a solution, but I can tell you some background:

Linux currently relies on the `--lang_exclude` flag to `pahole` to
filter Rust debugging information out of the output BTF. This is done
because various downstream tools (for example, bpftool) do not handle
Rust types correctly. The `--lang_exclude` flag works by checking the
language flag on the compilation unit the type is referenced in. Once
LTO is enabled, however, things can migrate from one compilation unit
to another, leading to C compilation units that reference Rust code
and types. `pahole` will treat the resulting type as a C-language
type, even though it is actually a Rust type. Even if you fixed
bpftool, without additional patches/hacks, the kernel would likely
fail to parse its own BTF debugging information once it boots, and
disable BPF loading.
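
A minimal sketch of the filtering behaviour described above, written as
a toy model in Python. It is not how `pahole` is implemented: the
classes, the `encode_btf` function, and the file and type names are all
hypothetical and exist only to show why a per-compilation-unit language
check misses a Rust type that ends up inside a C unit.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class TypeEntry:
    name: str
    origin_lang: str  # language the type was actually written in


@dataclass
class CompileUnit:
    name: str
    lang: str  # stands in for the language flag recorded on the CU
    types: List[TypeEntry] = field(default_factory=list)


def encode_btf(units: List[CompileUnit], lang_exclude: Set[str]) -> List[TypeEntry]:
    """Keep the types of every CU whose language is not excluded."""
    kept: List[TypeEntry] = []
    for cu in units:
        if cu.lang in lang_exclude:
            continue  # the whole CU is skipped; single types are never inspected
        kept.extend(cu.types)
    return kept


def leaked(types: List[TypeEntry], lang: str) -> List[str]:
    """Names of emitted types that really come from `lang`."""
    return [t.name for t in types if t.origin_lang == lang]


# Hypothetical units: one C, one Rust.
c_cu = CompileUnit("c_unit.c", "c", [TypeEntry("struct c_thing", "c")])
rust_cu = CompileUnit("rust_unit.rs", "rust", [TypeEntry("rust_crate::Thing", "rust")])

# Without LTO the Rust CU is dropped as a whole.
before = encode_btf([c_cu, rust_cu], {"rust"})

# Modelled LTO effect: the Rust type is now attached to the C CU.
merged_cu = CompileUnit("c_unit.c", "c", c_cu.types + rust_cu.types)
after = encode_btf([merged_cu], {"rust"})

print(leaked(before, "rust"))  # []
print(leaked(after, "rust"))   # ['rust_crate::Thing']
```

In this model the filter never looks at an individual type, so anything
that migrates into a unit flagged as C passes straight through.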

The most confusing part to me here is that I only encountered this
issue with cross-language (x-lang) LTO enabled, which is not available
in the kernel you're building from. If this is happening without
x-lang LTO enabled, it likely means that there's another way for debug
symbols to leak across CUs during LTO. That's where I'd start looking:
use `pahole` to dump the contents of `vmlinux.o` and see if you can
find a C-language CU referencing a Rust type. Then, try to figure out
how that's possible. With x-lang LTO the cause was obvious: inlining
caused a bunch of issues.
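
One way to start that search can be sketched as below. This swaps in a
different tool from the one suggested above: instead of `pahole`, it
assumes you have saved the text output of
`llvm-dwarfdump --debug-info vmlinux.o` and scans it in Python. The
heuristic is narrow and is an assumption, not an established method: it
only reports C-language CUs that contain a `DW_AT_linkage_name` starting
with `_R` (the prefix of Rust v0 mangled symbols), so it would not catch
a C CU that merely points at a type defined in another CU. The sample
text and the mangled names in it are made up for illustration.

```python
import re
from typing import Dict, List

# Line shapes assumed from llvm-dwarfdump's textual output.
DIE_RE = re.compile(r"^0x[0-9a-fA-F]+:\s+(DW_TAG_\w+)")
LANG_RE = re.compile(r"DW_AT_language\s+\((DW_LANG_\w+)\)")
NAME_RE = re.compile(r'DW_AT_name\s+\("([^"]*)"\)')
RUST_LINKAGE_RE = re.compile(r'DW_AT_linkage_name\s+\("(_R[^"]*)"\)')

C_LANGS = {"DW_LANG_C", "DW_LANG_C89", "DW_LANG_C99", "DW_LANG_C11", "DW_LANG_C17"}


def suspicious_c_units(dump_text: str) -> Dict[str, List[str]]:
    """Map each C-language CU name to the Rust-looking symbols found in it."""
    findings: Dict[str, List[str]] = {}
    cu_name, cu_lang, in_header = "<unknown>", None, False
    for line in dump_text.splitlines():
        die = DIE_RE.match(line)
        if die:
            # A new DIE starts; only a compile_unit DIE opens a CU header.
            in_header = die.group(1) == "DW_TAG_compile_unit"
            if in_header:
                cu_name, cu_lang = "<unknown>", None
            continue
        if in_header:
            lang = LANG_RE.search(line)
            if lang:
                cu_lang = lang.group(1)
                continue
            name = NAME_RE.search(line)
            if name:
                cu_name = name.group(1)
            continue
        link = RUST_LINKAGE_RE.search(line)
        if link and cu_lang in C_LANGS:
            findings.setdefault(cu_name, []).append(link.group(1))
    return findings


# Hypothetical excerpt; real dumps are far larger.
SAMPLE = """\
0x0000000b: DW_TAG_compile_unit
              DW_AT_producer    ("clang")
              DW_AT_language    (DW_LANG_C11)
              DW_AT_name        ("c_unit.c")

0x0000002a:   DW_TAG_subprogram
                DW_AT_linkage_name      ("_RNvCs_hypothetical_rust_fn")
                DW_AT_name      ("hypothetical_rust_fn")

0x00000100: DW_TAG_compile_unit
              DW_AT_language    (DW_LANG_Rust)
              DW_AT_name        ("rust_unit.rs")

0x00000120:   DW_TAG_subprogram
                DW_AT_linkage_name      ("_RNvCs_another_rust_fn")
"""

print(suspicious_c_units(SAMPLE))
```

Any CU the function reports is only a candidate; it still has to be
checked by hand to see how the Rust symbol got there.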

The last possibility I can think of is that somehow in your build
configuration `pahole` is not being invoked with the `--lang_exclude`
flag when building `vmlinux`. I don't know why that would be, but it
might be worth double-checking.
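
A rough way to double-check this, assuming you have a verbose build log
in which the `pahole` command lines are visible, is sketched below. The
log format, the sample lines, and the function name are assumptions made
for illustration; only `pahole`, its `-J` BTF-encoding option, and
`--lang_exclude` come from the tool itself.

```python
import shlex
from typing import List, Tuple


def pahole_btf_invocations(log_text: str) -> List[Tuple[str, bool]]:
    """Return (command line, has --lang_exclude) for each `pahole -J` line."""
    results: List[Tuple[str, bool]] = []
    for line in log_text.splitlines():
        try:
            words = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a command line we can parse
        if not any(w == "pahole" or w.endswith("/pahole") for w in words):
            continue
        if "-J" not in words:
            continue  # not a BTF-encoding run
        has_exclude = any(w.startswith("--lang_exclude") for w in words)
        results.append((line.strip(), has_exclude))
    return results


# Hypothetical log excerpt.
SAMPLE_LOG = """\
  CC      init/main.o
+ pahole -J --lang_exclude=rust .tmp_vmlinux.btf
+ /usr/bin/pahole -J .tmp_vmlinux.btf
"""

for command, ok in pahole_btf_invocations(SAMPLE_LOG):
    print("ok     " if ok else "MISSING", command)
```

A line reported as missing the flag would point at the build
configuration rather than at LTO.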



* Re: Weird failure with bpftool when building 6.11-rc4 with clang+rust+lto
From: Neal Gompa @ 2024-08-23 21:45 UTC
  To: Matthew Maurer
  Cc: rust-for-linux, Miguel Ojeda, Quentin Monnet, bpf,
	Justin M. Forbes, Davide Cavalca, Janne Grunau, Hector Martin,
	Asahi Linux

Hey Matthew,

The current thinking is that the culprit may be dwarves. We've
backported a fix in Fedora that may help; I'm waiting to find out
whether it does. Apparently all test runs with Clang+LTO are broken
right now with dwarves 1.27, so this isn't unique to my merge request.





--
Neal Gompa (FAS: ngompa)


* Re: Weird failure with bpftool when building 6.11-rc4 with clang+rust+lto
From: Neal Gompa @ 2024-08-31 11:25 UTC
  To: Matthew Maurer
  Cc: rust-for-linux, Miguel Ojeda, Quentin Monnet, bpf,
	Justin M. Forbes, Davide Cavalca, Janne Grunau, Hector Martin,
	Asahi Linux

On Fri, Aug 23, 2024 at 5:45 PM Neal Gompa <ngompa@fedoraproject.org> wrote:
>
> Hey Matthew,
>
> The current thinking is that maybe the culprit is dwarves. We've
> backported a fix in Fedora that may help, I'm waiting to find out if
> it does. Apparently all test runs with Clang+LTO are broken right now
> with dwarves 1.27, so it wasn't just unique to my merge request.
>

Well, that didn't work. The backported fix doesn't seem to have
resolved the problem. I'm at a loss now.



-- 
Neal Gompa (FAS: ngompa)
