All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: David Blaikie <dblaikie@gmail.com>
Cc: Yonghong Song <yhs@fb.com>,
	Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
	dwarves@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Bill Wendling <morbo@google.com>, bpf <bpf@vger.kernel.org>,
	kernel-team@fb.com, Nick Desaulniers <ndesaulniers@google.com>
Subject: Re: [PATCH dwarves] dwarf_loader: handle subprogram ret type with abstract_origin properly
Date: Fri, 2 Apr 2021 11:04:18 -0300	[thread overview]
Message-ID: <YGckYjyfxfNLzc34@kernel.org> (raw)
In-Reply-To: <CAENS6EsiRsY1JptWJqu2wH=m4fkSiR+zD8JDD5DYke=ZnJOMrg@mail.gmail.com>

Em Thu, Apr 01, 2021 at 05:00:46PM -0700, David Blaikie escreveu:
> On Thu, Apr 1, 2021 at 4:41 PM Yonghong Song <yhs@fb.com> wrote:
> >
> >
> >
> > On 4/1/21 3:27 PM, David Blaikie wrote:
> > > On Thu, Apr 1, 2021 at 2:41 PM Yonghong Song <yhs@fb.com
> > > <mailto:yhs@fb.com>> wrote:
> > >  >
> > >  >
> > >  >
> > >  > On 4/1/21 2:36 PM, Yonghong Song wrote:
> > >  > > With latest bpf-next built with clang lto (thin or full), I hit one
> > > test
> > >  > > failures:
> > >  > >    $ ./test_progs -t tcp
> > >  > >    ...
> > >  > >    libbpf: extern (func ksym) 'tcp_slow_start': func_proto [23]
> > > incompatible with kernel [115303]
> > >  > >    libbpf: failed to load object 'bpf_cubic'
> > >  > >    libbpf: failed to load BPF skeleton 'bpf_cubic': -22
> > >  > >    test_cubic:FAIL:bpf_cubic__open_and_load failed
> > >  > >    #9/2 cubic:FAIL
> > >  > >    ...
> > >  > >
> > >  > > The reason of the failure is due to bpf program 'tcp_slow_start'
> > >  > > func signature is different from vmlinux BTF. bpf program uses
> > >  > > the following signature:
> > >  > >    extern __u32 tcp_slow_start(struct tcp_sock *tp, __u32 acked);
> > >  > > which is identical to the kernel definition in linux:include/net/tcp.h:
> > >  > >    u32 tcp_slow_start(struct tcp_sock *tp, u32 acked);
> > >  > > While vmlinux BTF definition like:
> > >  > >    [115303] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2
> > >  > >            'tp' type_id=39373
> > >  > >            'acked' type_id=18
> > >  > >    [115304] FUNC 'tcp_slow_start' type_id=115303 linkage=static
> > >  > > The above is dumped with `bpftool btf dump file vmlinux`.
> > >  > > You can see the ret_type_id is 0 and this caused the problem.
> > >  > >
> > >  > > Looking at dwarf, we have:
> > >  > >
> > >  > > 0x11f2ec67:   DW_TAG_subprogram
> > >  > >                  DW_AT_low_pc    (0xffffffff81ed2330)
> > >  > >                  DW_AT_high_pc   (0xffffffff81ed235c)
> > >  > >                  DW_AT_frame_base        ()
> > >  > >                  DW_AT_GNU_all_call_sites        (true)
> > >  > >                  DW_AT_abstract_origin   (0x11f2ed66 "tcp_slow_start")
> > >  > > ...
> > >  > > 0x11f2ed66:   DW_TAG_subprogram
> > >  > >                  DW_AT_name      ("tcp_slow_start")
> > >  > >                  DW_AT_decl_file
> > > ("/home/yhs/work/bpf-next/net/ipv4/tcp_cong.c")
> > >  > >                  DW_AT_decl_line (392)
> > >  > >                  DW_AT_prototyped        (true)
> > >  > >                  DW_AT_type      (0x11f130c2 "u32")
> > >  > >                  DW_AT_external  (true)
> > >  > >                  DW_AT_inline    (DW_INL_inlined)
> > >  >
> > >  > David,
> > >  >
> > >  > Could you help confirm whether DW_AT_abstract_origin at a
> > >  > DW_TAG_subprogram always points to another DW_TAG_subprogram,
> > >  > or there are possible other cases?
> > >
> > > That's correct, so far as I understand the spec, specifically DWARFv5
> > > <http://dwarfstd.org/doc/DWARF5.pdf>
> > > 3.3.8.3 says:
> > >
> > > "The root entry for a concrete out-of-line instance of a given inlined
> > > subroutine has the same tag as does its associated (abstract) inlined
> > > subroutine entry (that is, tag DW_TAG_subprogram rather than
> > > DW_TAG_inlined_subroutine)."
> >
> > Thanks. That means that some of my codes in the patch is
> > dead code.
> >
> > >
> > > Though people may come up with novel uses of DWARF features. What would
> > > happen if this constraint were violated/what's your motivation for
> > > asking (I don't quite understand the connection between test_progs
> > > failure description, and this question)
> >
> > I have some codes to check the tag associated with abstract_origin
> > for a subprogram must be a subprogram. Through experiment, I didn't
> > see a violation, so I wonder that I can get confirmation from you
> > and then I may delete that code.
> >
> > The test_progs failure exposed the bug, that is all.
> >
> > pahole cannot handle all weird usages of dwarf, so I think pahole
> > is fine only to support well-formed dwarf.
> 
> Sounds good. Thanks for the context!

David, since you took the time to go thru the changes and to agree that
Yonghong's fix is good, can I add a:

Acked-by: David Blaikie <dblaikie@gmail.com>

to this patch?

Maybe even a:

Reviewed-by: David Blaikie <dblaikie@gmail.com>

Thanks,

- Arnaldo
 
> >
> > >
> > > - David
> > >
> > >  >
> > >  > Thanks,
> > >  >
> > >  > >
> > >  > > We have a subprogram which has an abstract_origin pointing to
> > >  > > the subprogram prototype with return type. Current one pass
> > >  > > recoding cannot easily resolve this easily since
> > >  > > at the time recoding for 0x11f2ec67, the return type in
> > >  > > 0x11f2ed66 has not been resolved.
> > >  > >
> > >  > > To simplify implementation, I just added another pass to
> > >  > > go through all functions after recoding pass. This should
> > >  > > resolve the above issue.
> > >  > >
> > >  > > With this patch, among total 250999 functions in vmlinux,
> > >  > > 4821 functions needs return type adjustment from type id 0
> > >  > > to correct values. The above failed bpf selftest passed too.
> > >  > >
> > >  > > Signed-off-by: Yonghong Song <yhs@fb.com <mailto:yhs@fb.com>>
> > >  > > ---
> > >  > >   dwarf_loader.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
> > >  > >   1 file changed, 46 insertions(+)
> > >  > >
> > >  > > Arnaldo, this is the last known pahole bug in my hand w.r.t. clang
> > >  > > LTO. With this, all self tests are passed except ones due
> > >  > > to global function inlining, static variable promotion etc, which
> > >  > > are not related to pahole.
> > >  > >
> > >  > > diff --git a/dwarf_loader.c b/dwarf_loader.c
> > >  > > index 026d137..367ac06 100644
> > >  > > --- a/dwarf_loader.c
> > >  > > +++ b/dwarf_loader.c
> > >  > > @@ -2198,6 +2198,42 @@ out:
> > >  > >       return 0;
> > >  > >   }
> > >  > >
> > >  > > +static int cu__resolve_func_ret_types(struct cu *cu)
> > >  > > +{
> > >  > > +     struct ptr_table *pt = &cu->functions_table;
> > >  > > +     uint32_t i;
> > >  > > +
> > >  > > +     for (i = 0; i < pt->nr_entries; ++i) {
> > >  > > +             struct tag *tag = pt->entries[i];
> > >  > > +
> > >  > > +             if (tag == NULL || tag->type != 0)
> > >  > > +                     continue;
> > >  > > +
> > >  > > +             struct function *fn = tag__function(tag);
> > >  > > +             if (!fn->abstract_origin)
> > >  > > +                     continue;
> > >  > > +
> > >  > > +             struct dwarf_tag *dtag = tag->priv;
> > >  > > +             struct dwarf_tag *dfunc;
> > >  > > +             dfunc = dwarf_cu__find_tag_by_ref(cu->priv,
> > > &dtag->abstract_origin);
> > >  > > +             if (dfunc == NULL) {
> > >  > > +                     tag__print_abstract_origin_not_found(tag);
> > >  > > +                     return -1;
> > >  > > +             }
> > >  > > +
> > >  > > +             /*
> > >  > > +              * Based on what I see it should be a subprogram,
> > >  > > +              * but double check anyway to ensure I won't mess up
> > >  > > +              * now and in the future.
> > >  > > +              */
> > >  > > +             if (dfunc->tag->tag != DW_TAG_subprogram)
> > >  > > +                     continue;
> > >  > > +
> > >  > > +             tag->type = dfunc->tag->type;
> > >  > > +     }
> > >  > > +     return 0;
> > >  > > +}
> > >  > > +
> > >  > >   static int cu__recode_dwarf_types_table(struct cu *cu,
> > >  > >                                       struct ptr_table *pt,
> > >  > >                                       uint32_t i)
> > >  > > @@ -2637,6 +2673,16 @@ static int cus__merge_and_process_cu(struct
> > > cus *cus, struct conf_load *conf,
> > >  > >       /* process merged cu */
> > >  > >       if (cu__recode_dwarf_types(cu) != LSK__KEEPIT)
> > >  > >               return DWARF_CB_ABORT;
> > >  > > +
> > >  > > +     /*
> > >  > > +      * for lto build, the function return type may not be
> > >  > > +      * resolved due to the return type of a subprogram is
> > >  > > +      * encoded in another subprogram through abstract_origin
> > >  > > +      * tag. Let us visit all subprograms again to resolve this.
> > >  > > +      */
> > >  > > +     if (cu__resolve_func_ret_types(cu) != LSK__KEEPIT)
> > >  > > +             return DWARF_CB_ABORT;
> > >  > > +
> > >  > >       if (finalize_cu_immediately(cus, cu, dcu, conf)
> > >  > >           == LSK__STOP_LOADING)
> > >  > >               return DWARF_CB_ABORT;
> > >  > >

-- 

- Arnaldo

  reply	other threads:[~2021-04-02 14:04 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-01 21:36 [PATCH dwarves] dwarf_loader: handle subprogram ret type with abstract_origin properly Yonghong Song
2021-04-01 21:41 ` Yonghong Song
2021-04-01 22:30   ` David Blaikie
     [not found]   ` <CAENS6EsZ5OX9o=Cn5L1jmx8ucR9siEWbGYiYHCUWuZjLyP3E7Q@mail.gmail.com>
2021-04-01 23:40     ` Yonghong Song
2021-04-02  0:00       ` David Blaikie
2021-04-02 14:04         ` Arnaldo Carvalho de Melo [this message]
2021-04-02 14:57           ` Arnaldo Carvalho de Melo
2021-04-02 17:23             ` Yonghong Song
2021-04-02 17:42               ` Yonghong Song
2021-04-02 18:08                 ` Arnaldo
2021-04-02 19:01                   ` Jiri Olsa
2021-04-02 19:41                   ` Yonghong Song
2021-04-03 17:08                     ` Arnaldo Carvalho de Melo
2021-04-04 17:22                       ` Jiri Olsa
2021-04-07 14:54                   ` Yonghong Song
2021-04-07 15:37                     ` Arnaldo Carvalho de Melo
2021-04-07 15:40                       ` [RFT] prepping up pahole 1.21, wanna test it? was " Arnaldo Carvalho de Melo
2021-04-07 22:53                         ` Yonghong Song
2021-04-09 13:52                           ` Arnaldo Carvalho de Melo
2021-04-08 21:23                         ` Jiri Olsa
2021-04-09 13:51                           ` Arnaldo Carvalho de Melo
2021-04-07 22:51                       ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YGckYjyfxfNLzc34@kernel.org \
    --to=acme@kernel.org \
    --cc=arnaldo.melo@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=dblaikie@gmail.com \
    --cc=dwarves@vger.kernel.org \
    --cc=kernel-team@fb.com \
    --cc=morbo@google.com \
    --cc=ndesaulniers@google.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.