Linux Kernel Selftest development
From: Yonghong Song <yhs@meta.com>
To: Espen Grindhaug <espen.grindhaug@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>, Mykola Lysenko <mykolal@fb.com>,
	Shuah Khan <shuah@kernel.org>,
	bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v2] libbpf: Improve version handling when attaching uprobe
Date: Mon, 1 May 2023 08:23:35 -0700
Message-ID: <533437a4-a76d-96e0-b04a-ab8eb7b5fb7f@meta.com>
In-Reply-To: <ZE+4Ct7ZMecFy7YV@eg>



On 5/1/23 6:00 AM, Espen Grindhaug wrote:
> On Thu, Apr 27, 2023 at 06:19:29PM -0700, Yonghong Song wrote:
>>
>>
>> On 4/27/23 12:19 PM, Espen Grindhaug wrote:
>>> On Wed, Apr 26, 2023 at 02:47:27PM -0700, Yonghong Song wrote:
>>>>
>>>>
>>>> On 4/23/23 11:55 AM, Espen Grindhaug wrote:
>>>>> This change fixes the handling of versions in elf_find_func_offset.
>>>>> In the previous implementation, we incorrectly assumed that the
>>>>
>>>> Could you explain in the commit message, ideally with an example,
>>>> what 'incorrectly' means here? In which situations is the
>>>> current libbpf implementation not correct?
>>>>
>>>
>>> How about something like this?
>>>
>>>
>>> libbpf: Improve version handling when attaching uprobe
>>>
>>> This change fixes the handling of versions in elf_find_func_offset.
>>>
>>> For example, let's assume we are trying to attach an uprobe to pthread_create in
>>> glibc. Prior to this commit, it would fail with an error message saying 'elf:
>>> ambiguous match [...]', this is because there are two entries in the symbol
>>> table with that name.
>>>
>>> $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
>>> 0000000000094cc0 T pthread_create@GLIBC_2.2.5
>>> 0000000000094cc0 T pthread_create@@GLIBC_2.34
>>>
>>> So we go ahead and modify our code to attach to 'pthread_create@@GLIBC_2.34',
>>> and this also fails, but this time with the error 'elf: failed to find symbol
>>> [...]'. This fails because we incorrectly assumed that the version information
>>> would be present in the string found in the string table, but there is only the
>>> string 'pthread_create'.
>>
>> I tried one example with my centos8 libpthread library.
>>
>> $ llvm-readelf -s /lib64/libc-2.28.so | grep pthread_cond_signal
>>      39: 0000000000095f70    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@@GLIBC_2.3.2
>>      40: 0000000000096250    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@GLIBC_2.2.5
>>    3160: 0000000000096250    43 FUNC    LOCAL  DEFAULT    14 __pthread_cond_signal_2_0
>>    3589: 0000000000095f70    43 FUNC    LOCAL  DEFAULT    14 __pthread_cond_signal
>>    5522: 0000000000095f70    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@@GLIBC_2.3.2
>>    5545: 0000000000096250    43 FUNC    GLOBAL DEFAULT    14 pthread_cond_signal@GLIBC_2.2.5
>> $ nm -D /lib64/libc-2.28.so | grep pthread_cond_signal
>> 0000000000095f70 T pthread_cond_signal@@GLIBC_2.3.2
>> 0000000000096250 T pthread_cond_signal@GLIBC_2.2.5
>> $
>>
>> Note that the two pthread_cond_signal functions have different addresses,
>> which is expected, as they are implemented for different versions.
>>
>> But in your case,
>>> $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
>>> 0000000000094cc0 T pthread_create@GLIBC_2.2.5
>>> 0000000000094cc0 T pthread_create@@GLIBC_2.34
>>
>> The two functions have the same address, which is very weird. I suspect
>> there is an issue here that at least needs some investigation.
>>
> 
> I am no expert on this, but as far as I can tell, this is normal,
> although much more common on my Ubuntu machine than my Fedora machine.
> 
> Script to find duplicates:
> 
> nm -D /usr/lib64/libc-2.33.so | awk '
> {
>      addr = $1;
>      symbol = $3;
>      sub(/[@].*$/, "", symbol);
> 
>      if (addr == prev_addr && symbol == prev_symbol) {
>          if (prev_symbol_printed == 0) {
>              print prev_line;
>              prev_symbol_printed = 1;
>          }
>          print;
>      } else {
>          prev_symbol_printed = 0;
>      }
>      prev_addr = addr;
>      prev_symbol = symbol;
>      prev_line = $0;
> }'
> 
> 
>> Second, the ELF encoding of a symbol table entry is as follows:
>>
>> typedef struct {
>>          Elf64_Word      st_name;
>>          unsigned char   st_info;
>>          unsigned char   st_other;
>>          Elf64_Half      st_shndx;
>>          Elf64_Addr      st_value;
>>          Elf64_Xword     st_size;
>> } Elf64_Sym;
>>
>> where
>> st_name
>>
>>      An index into the object file's symbol string table, which holds the
>> character representations of the symbol names. If the value is nonzero, the
>> value represents a string table index that gives the symbol name. Otherwise,
>> the symbol table entry has no name.
>>
>> So, the function name (including @..., @@...) should be in string table
>> which is the same for the above two pthread_cond_signal symbols.
>>
>> I think it is worthwhile to debug why in your situation
>> pthread_create@GLIBC_2.2.5 and pthread_create@@GLIBC_2.34 do not
>> have them in the string table.
>>
> 
> I think you are mistaken here; the strings in the strings table don't contain
> the version. Take a look at this partial dump of the strings table.
> 
> 	$ readelf -W -p .dynstr /usr/lib64/libc-2.33.so
> 
> 	String dump of section '.dynstr':
> 		[     1]  xdrmem_create
> 		[     f]  __wctomb_chk
> 		[    1c]  getmntent
> 		[    26]  __freelocale
> 		[    33]  __rawmemchr
> 		[    3f]  _IO_vsprintf
> 		[    4c]  getutent
> 		[    55]  __file_change_detection_for_path
> 	(...)
> 		[  350e]  memrchr
> 		[  3516]  pthread_cond_signal
> 		[  352a]  __close
> 	(...)
> 		[  61b6]  GLIBC_2.2.5
> 		[  61c2]  GLIBC_2.2.6
> 		[  61ce]  GLIBC_2.3
> 		[  61d8]  GLIBC_2.3.2
> 		[  61e4]  GLIBC_2.3.3
> 
> As you can see, the strings have no versions, and the version strings
> themselves are also in this table as entries at the end of the table.
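
To make the st_name lookup concrete, here is a minimal sketch of how an
offset resolves to a name in such a table. The helper name_at() is
hypothetical, and the byte string is rebuilt by hand from the first three
entries of the dump above; it is not read from a real file.

```python
def name_at(strtab: bytes, offset: int) -> str:
    """Return the NUL-terminated string that starts at `offset`."""
    if offset == 0:
        return ""  # st_name == 0 means the symbol has no name
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode("ascii")


# Hand-built stand-in for the start of .dynstr (assumption, not real data):
# a leading NUL byte followed by NUL-terminated names.
dynstr = b"\0xdrmem_create\0__wctomb_chk\0getmntent\0"

print(name_at(dynstr, 0x1))   # xdrmem_create
print(name_at(dynstr, 0xf))   # __wctomb_chk
print(name_at(dynstr, 0x1c))  # getmntent
```

The offsets agree with the ones readelf prints, and the strings returned
carry no '@' suffix.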

I see you searched the .dynstr section. Do you think we should
search .strtab instead, since it contains versioned symbols?
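
For comparison, staying with .dynsym means composing the versioned name
from the version tables before comparing. Below is a minimal sketch of
that idea, not the patch itself. The function names, the version indices
in `versions`, and the rule that an unversioned request matches any
version are all assumptions for illustration. The hidden bit (0x8000) and
the reserved indices 0 and 1 follow the GNU symbol versioning layout.

```python
VERSYM_HIDDEN = 0x8000   # bit 15 of a .gnu.version entry: non-default version
VERSYM_VERSION = 0x7FFF  # remaining bits: index into the version definitions
VER_NDX_GLOBAL = 1       # indices 0 and 1 are reserved (local / unversioned)


def qualified_name(name, versym, versions):
    """Compose 'name@ver' or 'name@@ver' the way nm -D prints it."""
    idx = versym & VERSYM_VERSION
    if idx <= VER_NDX_GLOBAL or idx not in versions:
        return name  # unversioned symbol
    sep = "@" if versym & VERSYM_HIDDEN else "@@"
    return name + sep + versions[idx]


def symbol_matches(requested, name, versym, versions):
    """Hypothetical policy: a bare name matches every version of a symbol."""
    if "@" not in requested:
        return requested == name
    return requested == qualified_name(name, versym, versions)


# Hypothetical version indices; only the version strings come from the thread.
versions = {2: "GLIBC_2.2.5", 3: "GLIBC_2.34"}

print(qualified_name("pthread_create", 0x8002, versions))
# pthread_create@GLIBC_2.2.5
print(qualified_name("pthread_create", 0x0003, versions))
# pthread_create@@GLIBC_2.34
```

With this shape, a request for 'pthread_create@@GLIBC_2.34' selects exactly
one of the two entries, while a bare 'pthread_create' still matches both
and would need a separate tie-breaking rule.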

> 
>>>
>>> This patch reworks how we compare the symbol name provided by the user if it is
>>> qualified with a version (using @ or @@). We now look up the correct version
>>> string in the version symbol table before constructing the full name, as also
>>> done above by nm, before comparing.
>>>
>>>>> version information would be present in the string found in the
>>>>> string table.
>>>>>
>>>>> We now look up the correct version string in the version symbol
>>>>> table before constructing the full name and then comparing.
>>>>>
>>>>> This patch adds support for both name@version and name@@version to
>>>>> match output of the various elf parsers.
>>>>>
>>>>> Signed-off-by: Espen Grindhaug <espen.grindhaug@gmail.com>
>>>>
>>>> [...]


Thread overview: 9+ messages
2023-04-23 18:55 [PATCH v2] libbpf: Improve version handling when attaching uprobe Espen Grindhaug
2023-04-26 21:47 ` Yonghong Song
2023-04-27 19:19   ` Espen Grindhaug
2023-04-28  1:19     ` Yonghong Song
2023-05-01 13:00       ` Espen Grindhaug
2023-05-01 15:23         ` Yonghong Song [this message]
2023-05-01 16:30           ` Espen Grindhaug
2023-05-01 17:20             ` Yonghong Song
2023-05-02  4:02 ` Andrii Nakryiko
