public inbox for dwarves@vger.kernel.org
 help / color / mirror / Atom feed
From: Jiri Olsa <olsajiri@gmail.com>
To: Eduard Zingerman <eddyz87@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>,
	Alan Maguire <alan.maguire@oracle.com>,
	Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>,
	dwarves@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, Daniel Borkmann <daniel@iogearbox.net>,
	kernel-team@fb.com, Song Liu <song@kernel.org>
Subject: Re: [PATCH dwarves 3/3] dwarf_loader: Check DW_OP_[GNU_]entry_value for possible parameter matching
Date: Mon, 11 Nov 2024 13:36:52 +0100	[thread overview]
Message-ID: <ZzH6ZDucO2nm9Y52@krava> (raw)
In-Reply-To: <080794545d8eb3df3d6eba90ac621111ab7171f5.camel@gmail.com>

On Mon, Nov 11, 2024 at 12:01:13AM -0800, Eduard Zingerman wrote:
> On Sun, 2024-11-10 at 03:38 -0800, Eduard Zingerman wrote:
> 
> [...]
> 
> > Also, it appears there is some bug either in pahole or in libdw's
> > implementation of dwarf_getlocation(). When I try both your patch-set
> > and my variant there is a segfault once in a while:
> > 
> >   $ for i in $(seq 1 100); \
> >     do echo "---> $i"; \
> >        pahole -j --skip_encoding_btf_inconsistent_proto -J --btf_encode_detached=/dev/null vmlinux ; \
> >     done
> >   ---> 1
> >   ...
> >   ---> 71
> >   Segmentation fault (core dumped)
> >   ...
> > 
> > The segfault happens only when -j (multiple threads) is passed.
> > If pahole is built with sanitizers
> > (passing -DCMAKE_C_FLAGS="-fsanitize=undefined,address")
> > the stack trace looks as follows:
> 
> Did some additional research for these SEGFAULTs.
> Looks like all we are in trouble.
> 
> # TLDR
> 
> libdw is not supposed to be used in a concurrent context.
> libdw is a part of elfutils package, the configuration flag
> making API thread-safe is documented as experimental:
>   --enable-thread-safety  enable thread safety of libraries EXPERIMENTAL
> At-least Fedora 40 does not ship elfutils built with this flag set.
> This colours all current parallel DWARF decoding questionable.
> 
> # Why segfault happens
> 
> Any references to elfutils source code are for commit [1].
> The dwarf_getlocation() is one of a few libdw APIs that uses memory
> allocation internally. The function dwarf_getlocation.c:__libdw_intern_expression
> iterates over expression encodings in DWARF and allocates
> a set of objects of type `struct loclist` and `Dwarf_Op`.
> Pointers to allocated objects are put to a binary tree for caching,
> see dwarf_getlocation.c:660, the call to eu_tsearch() function.
> The eu_tsearch() is a wrapper around libc tsearch() function.
> This wrapper provides locking for the tree,
> but only if --enable-thread-safety was set during elfutils configuration.
> The SEGFAULT happens inside tsearch() call because binary tree is malformed, e.g.:
> 
>   Thread 8 "pahole" received signal SIGSEGV, Segmentation fault.
>   [Switching to Thread 0x7fffd9c006c0 (LWP 2630074)]
>   0x00007ffff7c5d200 in maybe_split_for_insert (...) at tsearch.c:228
>   228	      if (parentp != NULL && RED(DEREFNODEPTR(parentp)))
>   (gdb) bt
>   #0  0x00007ffff7c5d200 in maybe_split_for_insert (...) at tsearch.c:228
>   #1  0x00007ffff7c5d466 in __GI___tsearch (...) at tsearch.c:358
>   #2  __GI___tsearch (...) at tsearch.c:290
>   #3  0x000000000048f096 in __interceptor_tsearch ()
>   #4  0x00007ffff7f5c482 in __libdw_intern_expression (...) at dwarf_getlocation.c:660
>   #5  0x00007ffff7f5cf51 in getlocation (...) at dwarf_getlocation.c:678
>   #6  getlocation (...) at dwarf_getlocation.c:667
>   #7  dwarf_getlocation (..._ at dwarf_getlocation.c:708
>   #8  0x00000000005a2ee5 in parameter.new ()
>   #9  0x00000000005a0122 in die.process_function ()
>   #10 0x0000000000597efd in __die__process_tag ()
>   #11 0x0000000000595ad9 in die.process_unit ()
>   #12 0x0000000000595436 in die.process ()
>   #13 0x00000000005b0187 in dwarf_cus.process_cu ()
>   #14 0x00000000005afa38 in dwarf_cus.process_cu_thread ()
>   #15 0x00000000004c7b8d in asan_thread_start(void*) ()
>   #16 0x00007ffff7bda6d7 in start_thread (arg=<optimized out>) at pthread_create.c:447
>   #17 0x00007ffff7c5e60c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
>   (gdb) p parentp
>   $1 = (node *) 0x50300079d2a0
>   (gdb) p *parentp
>   $2 = (node) 0x0
> 
> glibc provides a way to validate binary tree structure.
> For this misc/tsearch.c has to be changed to define DEBUGGING variable.
> (I used glibc 2.39 as provided by source rpm for Fedora 40 for experiments).
> If this is done and custom glibc is used for pahole execution,
> the following error is reported if '-j' flag is present:
> 
>   $ pahole -j --skip_encoding_btf_inconsistent_proto -J --btf_encode_detached=/home/eddy/work/tmp/my-new.btf vmlinux 
>   Fatal glibc error: tsearch.c:164 (check_tree_recurse): assertion failed: d_sofar == d_total
>   Fatal glibc error: tsearch.c:164 (check_tree_recurse): assertion failed: d_sofar == d_total
>   Aborted (core dumped)
> 
> Executing pahole using a custom-built libdw,
> built with --enable-thread-safety resolves the issue.

could we use libdw__lock around that? but I guess we use it on other
places as well..

jirka

> 
> [1] b2f225d6bff8 ("Consolidate and add files to clean target variables")
>     git://sourceware.org/git/elfutils.git
> 
> 

  reply	other threads:[~2024-11-11 12:36 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-08 18:05 [PATCH dwarves 0/3] Check DW_OP_[GNU_]entry_value for possible parameter matching Yonghong Song
2024-11-08 18:05 ` [PATCH dwarves 1/3] dwarf_loader: Refactor function parameter__new() Yonghong Song
2024-11-11 11:26   ` Alan Maguire
2024-11-08 18:05 ` [PATCH dwarves 2/3] dwarf_loader: Refactor function check_dwarf_locations() Yonghong Song
2024-11-08 18:05 ` [PATCH dwarves 3/3] dwarf_loader: Check DW_OP_[GNU_]entry_value for possible parameter matching Yonghong Song
2024-11-10 11:38   ` Eduard Zingerman
2024-11-11  8:01     ` Eduard Zingerman
2024-11-11 12:36       ` Jiri Olsa [this message]
2024-11-11 13:42         ` Arnaldo Carvalho de Melo
2024-11-15 16:26           ` elfutils thread-safety (Was: [PATCH dwarves 3/3] dwarf_loader: Check DW_OP_[GNU_]entry_value for possible parameter matching) Mark Wielaard
2024-11-11 18:51     ` [PATCH dwarves 3/3] dwarf_loader: Check DW_OP_[GNU_]entry_value for possible parameter matching Yonghong Song
2024-11-11  9:54   ` Jiri Olsa
2024-11-11 18:54     ` Yonghong Song
2024-11-11 15:39   ` Alan Maguire
2024-11-12  1:51     ` Yonghong Song
2024-11-12 16:56       ` Alan Maguire
2024-11-12 17:07         ` Yonghong Song
2024-11-12 18:33           ` Alan Maguire
2024-11-12 18:51             ` Yonghong Song
2024-11-12 19:10             ` Arnaldo Carvalho de Melo
2024-11-13 17:33               ` Alan Maguire
2024-11-13 18:27                 ` Yonghong Song
2024-11-14 12:16                   ` Alan Maguire
2024-11-14 16:29                     ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZzH6ZDucO2nm9Y52@krava \
    --to=olsajiri@gmail.com \
    --cc=alan.maguire@oracle.com \
    --cc=andrii@kernel.org \
    --cc=arnaldo.melo@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dwarves@vger.kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=song@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox