netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stanislav Fomichev <sdf@fomichev.me>
To: acme@kernel.org, bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, daniel@iogearbox.net, ast@fb.com,
	andriin@fb.com, morbo@google.com
Subject: pahole and LTO
Date: Fri, 10 Jan 2020 08:44:10 -0800	[thread overview]
Message-ID: <20200110164410.GA1075235@mini-arch> (raw)

tl;dr - building the kernel with clang and lto breaks BTF generation because
pahole doesn't seem to understand cross-cu references.

Can be reproduced with the following:
    $ cat a.c
    struct s;

    void f1() {}

    __attribute__((always_inline)) void f2(struct s *p)
    {
            if (p)
                    f1();
    }
    $ cat b.c
    struct s {
            int x;
    };

    void f2(struct s *p);

    int main()
    {
            struct s s = { 10 };
            f2(&s);
    }
    $ clang -fuse-ld=lld -flto {a,b}.c -g

    $ pahole a.out
    tag__recode_dwarf_type: couldn't find 0x3f type for 0x99 (inlined_subroutine)!
    lexblock__recode_dwarf_types: couldn't find 0x3f type for 0x99 (inlined_subroutine)!
    struct s {
            int                        x;                    /*     0     4 */

            /* size: 4, cachelines: 1, members: 1 */
            /* last cacheline: 4 bytes */
    };

From what I can tell, pahole internally loops over each cu and resolves only
local references, while the dwarf spec (table 2.3) states the following
about 'reference':
"Refers to one of the debugging information entries that describe the program.
There are four types of reference. The first is an offset relative to the
beginning of the compilation unit in which the reference occurs and must
refer to an entry within that same compilation unit. The second type of
reference is the offset of a debugging information entry in any compilation
unit, including one different from the unit containing the reference. The
third type of reference is an indirect reference to a type definition using
an 8-byte signature for that type. The fourth type of reference is a reference
from within the .debug_info section of the executable or shared object file to
a debugging information entry in the .debug_info section of a supplementary
object file."

In particular: "The second type of reference is the offset of a debugging
information entry in any compilation unit, including one different from the
unit containing the reference."


So the question is: is it a (known) issue? Is it something that's ommitted
on purpose? Or it's not implemented because lto is not (yet) widely used?


Here is the dwarf:

$ readelf --debug-dump=info a.out
Contents of the .debug_info section:

  Compilation Unit @ offset 0x0:
   Length:        0x44 (32-bit)
   Version:       4
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <c>   DW_AT_producer    : (indirect string, offset: 0x11): clang version 10.0.0 (https://github.com/llvm/llvm-project.git 5fe4679cc9cfb4941b766db07bf3cd928075d204)
    <10>   DW_AT_language    : 12	(ANSI C99)
    <12>   DW_AT_name        : (indirect string, offset: 0x0): a.c
    <16>   DW_AT_stmt_list   : 0x0
    <1a>   DW_AT_comp_dir    : (indirect string, offset: 0x7a): /usr/local/google/home/sdf/tmp/lto
    <1e>   DW_AT_low_pc      : 0x201730
    <26>   DW_AT_high_pc     : 0x6
 <1><2a>: Abbrev Number: 2 (DW_TAG_subprogram)
    <2b>   DW_AT_low_pc      : 0x201730
    <33>   DW_AT_high_pc     : 0x6
    <37>   DW_AT_frame_base  : 1 byte block: 56 	(DW_OP_reg6 (rbp))
    <39>   DW_AT_name        : (indirect string, offset: 0xa4): f1
    <3d>   DW_AT_decl_file   : 1
    <3e>   DW_AT_decl_line   : 3
    <3f>   DW_AT_external    : 1
 <1><3f>: Abbrev Number: 3 (DW_TAG_subprogram)
    <40>   DW_AT_name        : (indirect string, offset: 0x4): f2
    <44>   DW_AT_decl_file   : 1
    <45>   DW_AT_decl_line   : 5
    <46>   DW_AT_prototyped  : 1
    <46>   DW_AT_external    : 1
    <46>   DW_AT_inline      : 1	(inlined)
 <1><47>: Abbrev Number: 0
  Compilation Unit @ offset 0x48:
   Length:        0x7f (32-bit)
   Version:       4
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><53>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <54>   DW_AT_producer    : (indirect string, offset: 0x11): clang version 10.0.0 (https://github.com/llvm/llvm-project.git 5fe4679cc9cfb4941b766db07bf3cd928075d204)
    <58>   DW_AT_language    : 12	(ANSI C99)
    <5a>   DW_AT_name        : (indirect string, offset: 0x7): b.c
    <5e>   DW_AT_stmt_list   : 0x3a
    <62>   DW_AT_comp_dir    : (indirect string, offset: 0x7a): /usr/local/google/home/sdf/tmp/lto
    <66>   DW_AT_low_pc      : 0x201740
    <6e>   DW_AT_high_pc     : 0x1f
 <1><72>: Abbrev Number: 4 (DW_TAG_subprogram)
    <73>   DW_AT_low_pc      : 0x201740
    <7b>   DW_AT_high_pc     : 0x1f
    <7f>   DW_AT_frame_base  : 1 byte block: 56 	(DW_OP_reg6 (rbp))
    <81>   DW_AT_name        : (indirect string, offset: 0x9d): main
    <85>   DW_AT_decl_file   : 1
    <86>   DW_AT_decl_line   : 7
    <87>   DW_AT_type        : <0xae>
    <8b>   DW_AT_external    : 1
 <2><8b>: Abbrev Number: 5 (DW_TAG_variable)
    <8c>   DW_AT_location    : 2 byte block: 91 78 	(DW_OP_fbreg: -8)
    <8f>   DW_AT_name        : (indirect string, offset: 0xb): s
    <93>   DW_AT_decl_file   : 1
    <94>   DW_AT_decl_line   : 9
    <95>   DW_AT_type        : <0xb5>
 <2><99>: Abbrev Number: 6 (DW_TAG_inlined_subroutine)
    <9a>   DW_AT_abstract_origin: <0x3f>
    <9e>   DW_AT_low_pc      : 0x201752
    <a6>   DW_AT_high_pc     : 0x5
    <aa>   DW_AT_call_file   : 1
    <ab>   DW_AT_call_line   : 10
    <ac>   DW_AT_call_column : 
 <2><ad>: Abbrev Number: 0
 <1><ae>: Abbrev Number: 7 (DW_TAG_base_type)
    <af>   DW_AT_name        : (indirect string, offset: 0xd): int
    <b3>   DW_AT_encoding    : 5	(signed)
    <b4>   DW_AT_byte_size   : 4
 <1><b5>: Abbrev Number: 8 (DW_TAG_structure_type)
    <b6>   DW_AT_name        : (indirect string, offset: 0xb): s
    <ba>   DW_AT_byte_size   : 4
    <bb>   DW_AT_decl_file   : 1
    <bc>   DW_AT_decl_line   : 1
 <2><bd>: Abbrev Number: 9 (DW_TAG_member)
    <be>   DW_AT_name        : (indirect string, offset: 0xa2): x
    <c2>   DW_AT_type        : <0xae>
    <c6>   DW_AT_decl_file   : 1
    <c7>   DW_AT_decl_line   : 2
    <c8>   DW_AT_data_member_location: 0
 <2><c9>: Abbrev Number: 0
 <1><ca>: Abbrev Number: 0

             reply	other threads:[~2020-01-10 16:44 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-10 16:44 Stanislav Fomichev [this message]
2020-01-10 16:47 ` pahole and LTO Arnaldo Carvalho de Melo
2020-01-10 17:22   ` Stanislav Fomichev
2020-01-10 17:29     ` Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200110164410.GA1075235@mini-arch \
    --to=sdf@fomichev.me \
    --cc=acme@kernel.org \
    --cc=andriin@fb.com \
    --cc=ast@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=morbo@google.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).