From: Ihor Solodrai <ihor.solodrai@linux.dev>
To: Alan Maguire <alan.maguire@oracle.com>, Jiri Olsa <olsajiri@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Menglong Dong <menglong8.dong@gmail.com>,
	dwarves@vger.kernel.org, bpf@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andriin@fb.com>, Yonghong Song <yhs@fb.com>,
	Song Liu <songliubraving@fb.com>,
	Eduard Zingerman <eddyz87@gmail.com>
Subject: Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries
Date: Thu, 24 Jul 2025 14:26:46 -0700
Message-ID: <83d1b791-85fd-49ea-9c40-f3ba4c23850d@linux.dev>
In-Reply-To: <f44af47f-e05e-4fa4-95ca-bf95f04e4c27@oracle.com>

On 7/24/25 10:54 AM, Alan Maguire wrote:
> On 23/07/2025 12:22, Jiri Olsa wrote:
>> On Tue, Jul 22, 2025 at 10:58:52PM +0000, Ihor Solodrai wrote:
>>
>> SNIP
>>
>>> @@ -1338,48 +1381,39 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder,
>>>  	return 0;
>>>  }
>>>
>>> -static int functions_cmp(const void *_a, const void *_b)
>>> +static int elf_function__name_cmp(const void *_a, const void *_b)
>>>  {
>>>  	const struct elf_function *a = _a;
>>>  	const struct elf_function *b = _b;
>>>
>>> -	/* if search key allows prefix match, verify target has matching
>>> -	 * prefix len and prefix matches.
>>> -	 */
>>> -	if (a->prefixlen && a->prefixlen == b->prefixlen)
>>> -		return strncmp(a->name, b->name, b->prefixlen);
>>
>> nice to see this one removed ;-)
>>
>>>  	return strcmp(a->name, b->name);
>>>  }
>>>
>>> -#ifndef max
>>> -#define max(x, y) ((x) < (y) ? (y) : (x))
>>> -#endif
>>> -
>>>  static int saved_functions_cmp(const void *_a, const void *_b)
>>>  {
>>>  	const struct btf_encoder_func_state *a = _a;
>>>  	const struct btf_encoder_func_state *b = _b;
>>>
>>> -	return functions_cmp(a->elf, b->elf);
>>> +	return elf_function__name_cmp(a->elf, b->elf);
>>>  }
>>>
>>>  static int saved_functions_combine(struct btf_encoder_func_state *a, struct btf_encoder_func_state *b)
>>>  {
>>> -	uint8_t optimized, unexpected, inconsistent;
>>> -	int ret;
>>> +	uint8_t optimized, unexpected, inconsistent, ambiguous_addr;
>>> +
>>> +	if (a->elf != b->elf)
>>> +		return 1;
>>>
>>> -	ret = strncmp(a->elf->name, b->elf->name,
>>> -		      max(a->elf->prefixlen, b->elf->prefixlen));
>>> -	if (ret != 0)
>>> -		return ret;
>>>  	optimized = a->optimized_parms | b->optimized_parms;
>>>  	unexpected = a->unexpected_reg | b->unexpected_reg;
>>>  	inconsistent = a->inconsistent_proto | b->inconsistent_proto;
>>> -	if (!unexpected && !inconsistent && !funcs__match(a, b))
>>> +	ambiguous_addr = a->ambiguous_addr | b->ambiguous_addr;
>>> +	if (!unexpected && !inconsistent && !ambiguous_addr && !funcs__match(a, b))
>>>  		inconsistent = 1;
>>>  	a->optimized_parms = b->optimized_parms = optimized;
>>>  	a->unexpected_reg = b->unexpected_reg = unexpected;
>>>  	a->inconsistent_proto = b->inconsistent_proto = inconsistent;
>>> +	a->ambiguous_addr = b->ambiguous_addr = ambiguous_addr;
>>
>>
>> I had to add change below to get the functions with multiple addresses out
>>
>> diff --git a/btf_encoder.c b/btf_encoder.c
>> index fcc30aa9d97f..7b9679794790 100644
>> --- a/btf_encoder.c
>> +++ b/btf_encoder.c
>> @@ -1466,7 +1466,7 @@ static int btf_encoder__add_saved_funcs(struct btf_encoder *encoder, bool skip_e
>>  		 * just do not _use_ them. Only exclude functions with
>>  		 * unexpected register use or multiple inconsistent prototypes.
>>  		 */
>> -		add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto;
>> +		add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto && !state->ambiguous_addr;
>>
>>  		if (add_to_btf) {
>>  			err = btf_encoder__add_func(state->encoder, state);
>>
>>
>> other than that I like the approach
>>
>
> Thanks for the patch! I ran it through CI [1] with the above change plus
> an added whitespace after the function name in the printf() in
> btf_encoder__log_func_skip(). The btf_functions.sh test expects
> whitespace after function names when examining skipped functions, so
> either the test should be updated to handle no whitespace or we should
> ensure the space is there after the function name like this:
>
> printf("%s : skipping BTF encoding of function due to ",
> func->name);
>
> Otherwise we get a CI failure that is nothing to do with the changes.
>
> With this in place we do however lose a lot of functions it seems, some
> I suspect unnecessarily. For example:
>
>
> Looking at
>
> < void __tcp_send_ack(struct sock * sk, u32 rcv_nxt, u16 flags);
>
> ffffffff83c83170 t __tcp_send_ack.part.0
> ffffffff83c83310 T __tcp_send_ack
>
> So __tcp_send_ack is partially inlined, but partial inlines should not
> count as ambiguous addresses I think. We should probably ensure we skip
> .part suffixes as well as .cold in calculating ambiguous addresses.
>
> I modified the patch somewhat and we wind up losing ~400 functions
> instead of over 700, see [2].
>
> Modified patch is at [3]. If the mods look okay to you Ihor would you
> mind sending it officially? Would be great to get wider testing to
> ensure it doesn't break anything or leave any functions out unexpectedly.

Alan, Jiri, thank you for the review and testing. I sent this draft in
a bit of a rush, sorry.

I'll incorporate your suggestions, test the patch a bit more, and then
send a clean version. I am curious which functions are lost and why;
I'll report back if I notice anything interesting.

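To make the .part/.cold point concrete, here is a rough standalone
sketch of one way the check could work. It is not code from the patch:
struct sym_entry, has_suffix_component(), sym_is_fragment() and
base_has_ambiguous_addr() are made-up names for illustration, and the
assumption is that a base name counts as ambiguous only when its
non-fragment symbols sit at more than one address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one symbol table entry; not a pahole type. */
struct sym_entry {
	const char *name;	/* full symbol name, e.g. "foo.part.0" */
	uint64_t addr;
};

/* True if name contains comp (e.g. ".part") as a whole dot-separated component. */
static bool has_suffix_component(const char *name, const char *comp)
{
	size_t len = strlen(comp);
	const char *p = name;

	while ((p = strstr(p, comp)) != NULL) {
		if (p[len] == '\0' || p[len] == '.')
			return true;
		p += len;
	}
	return false;
}

/* Symbols that are pieces of another function rather than separate copies. */
static bool sym_is_fragment(const char *name)
{
	return has_suffix_component(name, ".part") ||
	       has_suffix_component(name, ".cold");
}

/*
 * Returns true if the symbols whose name, cut at the first '.', equals
 * base are found at more than one address, ignoring fragments.
 */
static bool base_has_ambiguous_addr(const struct sym_entry *syms, size_t cnt,
				    const char *base)
{
	size_t len = strlen(base);
	uint64_t first = 0;
	bool seen = false;

	assert(syms != NULL || cnt == 0);

	for (size_t i = 0; i < cnt; i++) {
		const char *name = syms[i].name;

		if (strcspn(name, ".") != len || strncmp(name, base, len) != 0)
			continue;
		if (sym_is_fragment(name))
			continue;
		if (!seen) {
			first = syms[i].addr;
			seen = true;
		} else if (syms[i].addr != first) {
			return true;
		}
	}
	return false;
}
```

With the two __tcp_send_ack symbols from Alan's example only the 'T'
symbol is counted, so the function would not be flagged. Whether other
suffixes deserve the same treatment is something I still need to check.
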
>
>> SNIP
>>
>>> @@ -2153,18 +2191,75 @@ static int elf_functions__collect(struct elf_functions *functions)
>>>  		goto out_free;
>>>  	}
>>>
>>> +	/* First, collect an elf_function for each GElf_Sym
>>> +	 * Where func->name is without a suffix
>>> +	 */
>>>  	functions->cnt = 0;
>>>  	elf_symtab__for_each_symbol_index(functions->symtab, core_id, sym, sym_sec_idx) {
>>> -		elf_functions__collect_function(functions, &sym);
>>> +
>>> +		if (elf_sym__type(&sym) != STT_FUNC)
>>> +			continue;
>>> +
>>> +		sym_name = elf_sym__name(&sym, functions->symtab);
>>> +		if (!sym_name)
>>> +			continue;
>>> +
>>> +		func = &functions->entries[functions->cnt];
>>> +
>>> +		const char *suffix = strchr(sym_name, '.');
>>> +		if (suffix) {
>>> +			functions->suffix_cnt++;
>>
>> do we need suffix_cnt now?
>>
>
> think it's been unused for a while now, so can be removed I think.
>
> Thanks again for working on this!
>
> Alan
>
> [1] https://github.com/alan-maguire/dwarves/actions/runs/16500065295
> [2]
> https://github.com/alan-maguire/dwarves/actions/runs/16501897430/job/46662503155
> [3]
> https://github.com/acmel/dwarves/commit/30dffd7fc34e7753b3d21b4b3f1a5e17814c224f
>
>> thanks,
>> jirka
>>
>>
>>> +			func->name = strndup(sym_name, suffix - sym_name);
>>> +		} else {
>>> +			func->name = strdup(sym_name);
>>> +		}
>>> +		if (!func->name) {
>>> +			err = -ENOMEM;
>>> +			goto out_free;
>>> +		}
>>> +
>>> +		func_sym.name = sym_name;
>>> +		func_sym.addr = sym.st_value;
>>> +
>>> +		err = elf_function__push_sym(func, &func_sym);
>>> +		if (err)
>>> +			goto out_free;
>>> +
>>> +		functions->cnt++;
>>>  	}
>>
>> SNIP
>
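
For anyone skimming the thread, the collection step quoted above
amounts to cutting each symbol name at the first '.' and then sorting
and looking up entries by the truncated name with a plain strcmp(). A
minimal sketch, assuming POSIX strndup(); struct func_entry,
base_name_dup(), func_entry_name_cmp() and find_func() are hypothetical
names, not the ones used in pahole:

```c
#define _POSIX_C_SOURCE 200809L	/* strdup(), strndup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, stripped-down stand-in for an entry in the function table. */
struct func_entry {
	char *name;	/* symbol name without any ".suffix" */
};

/* Duplicate sym_name up to (not including) the first '.'; NULL on ENOMEM. */
static char *base_name_dup(const char *sym_name)
{
	const char *suffix;

	assert(sym_name != NULL);

	suffix = strchr(sym_name, '.');
	if (suffix)
		return strndup(sym_name, (size_t)(suffix - sym_name));
	return strdup(sym_name);
}

/* Exact-match comparison on the truncated name, usable with qsort()/bsearch(). */
static int func_entry_name_cmp(const void *_a, const void *_b)
{
	const struct func_entry *a = _a;
	const struct func_entry *b = _b;

	return strcmp(a->name, b->name);
}

/* Look up a truncated name in a table already sorted with func_entry_name_cmp(). */
static struct func_entry *find_func(struct func_entry *entries, size_t cnt,
				    const char *name)
{
	struct func_entry key = { .name = (char *)name };

	return bsearch(&key, entries, cnt, sizeof(*entries), func_entry_name_cmp);
}
```

Because every entry already carries the truncated name, the comparison
needs no prefix-length special case, which is what lets the old
prefixlen branch go away.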