BPF List
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Yonghong Song <yhs@fb.com>, bpf <bpf@vger.kernel.org>
Subject: Re: Latest libbpf fails to load programs compiled with old LLVM
Date: Tue, 08 Dec 2020 14:41:28 +0100	[thread overview]
Message-ID: <87czzkifef.fsf@toke.dk> (raw)
In-Reply-To: <CAEf4Bzb=zi6ew3fgAg29ZB0tcBw8xfEX-pRuMeAyYBiXp5ewTw@mail.gmail.com>

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Mon, Dec 7, 2020 at 3:00 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>
>> > On Fri, Dec 4, 2020 at 9:55 AM Yonghong Song <yhs@fb.com> wrote:
>> >>
>> >>
>> >>
>> >> On 12/4/20 1:34 AM, Toke Høiland-Jørgensen wrote:
>> >> > Yonghong Song <yhs@fb.com> writes:
>> >> >
>> >> >> On 12/3/20 9:55 AM, Toke Høiland-Jørgensen wrote:
>> >> >>> Hi Andrii
>> >> >>>
>> >> >>> I noticed that recent libbpf versions fail to load BPF files compiled
>> >> >>> with old versions of LLVM. E.g., if I compile xdp-tools with LLVM 7 I
>> >> >>> get:
>> >> >>>
>> >> >>> $ sudo ./xdp-loader load testns ../lib/testing/xdp_drop.o -vv
>> >> >>> Loading 1 files on interface 'testns'.
>> >> >>> libbpf: loading ../lib/testing/xdp_drop.o
>> >> >>> libbpf: elf: section(3) prog, size 16, link 0, flags 6, type=1
>> >> >>> libbpf: sec 'prog': failed to find program symbol at offset 0
>> >> >>> Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid
>> >> >>>
>> >> >>> The 'failed to find program symbol' error seems to have been introduced
>> >> >>> with commit c112239272c6 ("libbpf: Parse multi-function sections into
>> >> >>> multiple BPF programs").
>> >> >>>
>> >> >>> Looking at the object file in question, indeed it seems to not have any
>> >> >>> function symbols defined:
>> >> >>>
>> >> >>> $  llvm-objdump --syms ../lib/testing/xdp_drop.o
>> >> >>>
>> >> >>> ../lib/testing/xdp_drop.o:  file format elf64-bpf
>> >> >>>
>> >> >>> SYMBOL TABLE:
>> >> >>> 0000000000000000 l       .debug_str 0000000000000000
>> >> >>> 0000000000000037 l       .debug_str 0000000000000000
>> >> >>> 0000000000000042 l       .debug_str 0000000000000000
>> >> >>> 0000000000000068 l       .debug_str 0000000000000000
>> >> >>> 0000000000000071 l       .debug_str 0000000000000000
>> >> >>> 0000000000000076 l       .debug_str 0000000000000000
>> >> >>> 000000000000008a l       .debug_str 0000000000000000
>> >> >>> 0000000000000097 l       .debug_str 0000000000000000
>> >> >>> 00000000000000a3 l       .debug_str 0000000000000000
>> >> >>> 00000000000000ac l       .debug_str 0000000000000000
>> >> >>> 00000000000000b5 l       .debug_str 0000000000000000
>> >> >>> 00000000000000bc l       .debug_str 0000000000000000
>> >> >>> 00000000000000c9 l       .debug_str 0000000000000000
>> >> >>> 00000000000000d4 l       .debug_str 0000000000000000
>> >> >>> 00000000000000dd l       .debug_str 0000000000000000
>> >> >>> 00000000000000e1 l       .debug_str 0000000000000000
>> >> >>> 00000000000000e5 l       .debug_str 0000000000000000
>> >> >>> 00000000000000ea l       .debug_str 0000000000000000
>> >> >>> 00000000000000f0 l       .debug_str 0000000000000000
>> >> >>> 00000000000000f9 l       .debug_str 0000000000000000
>> >> >>> 0000000000000103 l       .debug_str 0000000000000000
>> >> >>> 0000000000000113 l       .debug_str 0000000000000000
>> >> >>> 0000000000000122 l       .debug_str 0000000000000000
>> >> >>> 0000000000000131 l       .debug_str 0000000000000000
>> >> >>> 0000000000000000 l    d  prog       0000000000000000 prog
>> >> >>> 0000000000000000 l    d  .debug_abbrev      0000000000000000 .debug_abbrev
>> >> >>> 0000000000000000 l    d  .debug_info        0000000000000000 .debug_info
>> >> >>> 0000000000000000 l    d  .debug_frame       0000000000000000 .debug_frame
>> >> >>> 0000000000000000 l    d  .debug_line        0000000000000000 .debug_line
>> >> >>> 0000000000000000 g       license    0000000000000000 _license
>> >> >>> 0000000000000000 g       prog       0000000000000000 xdp_drop
>> >> >>>
>> >> >>>
>> >> >>> I assume this is because old LLVM versions simply don't emit that symbol
>> >> >>> information?
>> >>
>> >> Thanks for the below instruction and xdp_drop.c file. I can reproduce
>> >> the issue now.
>> >>
>> >> I added another function 'xdp_drop1' in the same thing. Below is the
>> >> symbol table with llvm7 vs. llvm12.
>> >>
>> >> -bash-4.4$ llvm-readelf -symbols xdp-7.o | grep xdp_drop
>> >>      32: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT     3 xdp_drop
>> >>      33: 0000000000000010     0 NOTYPE  GLOBAL DEFAULT     3 xdp_drop1
>> >>
>> >>    [ 3] prog              PROGBITS        0000000000000000 000040 000020
>> >> 00  AX  0   0  8
>> >>
>> >> -bash-4.4$ llvm-readelf -symbols xdp-12.o | grep xdp_drop
>> >>      32: 0000000000000000    16 FUNC    GLOBAL DEFAULT     3 xdp_drop
>> >>      33: 0000000000000010    16 FUNC    GLOBAL DEFAULT     3 xdp_drop1
>> >> -bash-4.4$
>> >>
>> >>    [ 3] prog              PROGBITS        0000000000000000 000040 000020
>> >> 00  AX  0   0  8
>> >>
>> >>
>> >> Yes, llvm7 does not encode type and size for FUNC's. I guess libbpf can
>> >> change to recognize NOTYPE and use the symbol value (representing the
>> >> offset from the start of the section) and section size to
>> >> calculate the individual function size. This is more complicated than
>> >> elf file providing FUNC type and symbol size directly.
>> >
>> > I think we should just face the fact that LLVM7 is way too old to
>> > produce a sensible BPF ELF file layout. We can extend:
>> >
>> > libbpf: sec 'prog': failed to find program symbol at offset 0
>> > Couldn't open file '../lib/testing/xdp_drop.o': BPF object format invalid
>> >
>> > with a suggestion to upgrade Clang/LLVM to something more recent, if
>> > that would be helpful.
>> >
>> > But I don't want to add error-prone checks and assumptions in the
>> > already quite complicated logic. Even the kernel itself maintains that
>> > Clang 10+ needs to be used for its compilation. BPF CO-RE is also not
>> > working with older than Clang10, so lots of people have already
>> > upgraded way beyond that.
>>
>> Wait, what? This is a regression that *breaks people's programs* on
>> compiler versions that are still very much in the wild! I mean, fine if
>> you don't want to support new features on such files, but then surely we
>> can at least revert back to the old behaviour?
>
> This is clearly a bug in LLVM7, which didn't produce correct ELF
> symbols, do we agree on that? libbpf used to handle such invalid ELF
> files *by accident* until it changed its internal logic to be more
> strict in v0.2. It became more strict and doesn't work with such
> invalid ELF files anymore. Does it need to add extra quirks to support
> such broken ELF? I don't think so.

I don't know enough about the intricacies of the ELF format to say, but
I believe you when you say it's a bug. However, that doesn't change the
fact that from a user's PoV, something that was working before is now
broken, with the only change being a newer libbpf.

This is not a theoretical concern, BTW, I discovered this due to
feedback from a partner that we've been pushing to adopt libbpf. When
they finally tried it out, the first thing they noticed is that their
programs wouldn't load due to this issue.

Sure, I can tell them to just upgrade their toolchain (and I will), but
that still means we're back to "in order to use this library, you should
expect to keep chasing the latest version of the entire toolchain". And
this is a much harder sell than "this is a stable library and upstream
takes backwards compatibility very serious", which I *thought* was the
expectation.

> Surely, users that can't upgrade LLVM7 to something less ancient, can
> stick to libbpf v0.1, that was lenient enough to accept such invalid
> ELF files. libbpf v0.2 was released more than a month ago, and so far
> you are the only one who noticed this "regression". So hopefully it's
> not super annoying to people and they would be accommodating enough to
> use more up to date compiler (and save themselves lots of trouble
> along the way).

Oh, boy, do I envy your adoption rate for new versions! In my world I
would expect that by one month a few people who are very early adopters
have started noticing and maybe thinking about testing the new version :)

>> > Speaking of legacy. Toke, can you please update all the samples in
>> > your xdp-tools repo to not use arbitrary sections names. I see
>> > SEC("prog"), where it should really be SEC("xdp"). It sets a bad
>> > example for newcomers, IMO.
>>
>> I used "prog" because that's what iproute2 looks for if you don't supply
>
> Ok.

Fixed now, BTW:
https://github.com/xdp-project/xdp-tools/commit/83ab8aa1c29408aac842bebe704aa47ec5dc5bc3

>> a section name, so it makes it convenient to load programs with 'ip'
>> without supplying the section name. However, I do realise this is not
>> the best of reasons, and I am not opposed to changing it. However...
>>
>> > I'm also going to emit warnings in libbpf soon for section names that
>> > don't follow proper libbpf naming pattern, so it would be good if you
>> > could get ahead of the curve.
>>
>> ...this sounds like just another way to annoy users by breaking things
>> that were working before? :/
>
> It won't break, libbpf will emit a warning about the need to use
> proper section name format, which will start to be enforced only with
> major version bump. So that will give users plenty of time to make
> sure their BPF programs are compatible with stricter libbpf.

Well see above re: different expectations for "plenty of time". But OK,
maybe this isn't as bad as I figured at first glance :)

-Toke


  reply	other threads:[~2020-12-08 13:43 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-03 17:55 Latest libbpf fails to load programs compiled with old LLVM Toke Høiland-Jørgensen
2020-12-04  1:42 ` Yonghong Song
2020-12-04  9:34   ` Toke Høiland-Jørgensen
2020-12-04 17:54     ` Yonghong Song
2020-12-04 19:23       ` Andrii Nakryiko
2020-12-07 10:59         ` Toke Høiland-Jørgensen
2020-12-07 15:55           ` Alexei Starovoitov
2020-12-07 16:15             ` Toke Høiland-Jørgensen
2020-12-07 16:20               ` Alexei Starovoitov
2020-12-07 16:51                 ` Toke Høiland-Jørgensen
2020-12-07 17:16                   ` Alexei Starovoitov
2020-12-07 22:18                     ` Toke Høiland-Jørgensen
2020-12-08  1:20                       ` Alexei Starovoitov
2020-12-07 18:02                 ` David Ahern
2020-12-07 18:14                   ` Alexei Starovoitov
2020-12-07 18:59                     ` David Ahern
2020-12-08  2:47           ` Andrii Nakryiko
2020-12-08 13:41             ` Toke Høiland-Jørgensen [this message]
2020-12-08 18:39               ` Andrii Nakryiko
2020-12-08 22:38                 ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87czzkifef.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox