bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ihor Solodrai <ihor.solodrai@linux.dev>
To: Alan Maguire <alan.maguire@oracle.com>, Jiri Olsa <olsajiri@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Menglong Dong <menglong8.dong@gmail.com>,
	dwarves@vger.kernel.org, bpf@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andriin@fb.com>, Yonghong Song <yhs@fb.com>,
	Song Liu <songliubraving@fb.com>,
	Eduard Zingerman <eddyz87@gmail.com>
Subject: Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries
Date: Thu, 24 Jul 2025 14:26:46 -0700	[thread overview]
Message-ID: <83d1b791-85fd-49ea-9c40-f3ba4c23850d@linux.dev> (raw)
In-Reply-To: <f44af47f-e05e-4fa4-95ca-bf95f04e4c27@oracle.com>

On 7/24/25 10:54 AM, Alan Maguire wrote:
> On 23/07/2025 12:22, Jiri Olsa wrote:
>> On Tue, Jul 22, 2025 at 10:58:52PM +0000, Ihor Solodrai wrote:
>>
>> SNIP
>>
>>> @@ -1338,48 +1381,39 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder,
>>>   	return 0;
>>>   }
>>>   
>>> -static int functions_cmp(const void *_a, const void *_b)
>>> +static int elf_function__name_cmp(const void *_a, const void *_b)
>>>   {
>>>   	const struct elf_function *a = _a;
>>>   	const struct elf_function *b = _b;
>>>   
>>> -	/* if search key allows prefix match, verify target has matching
>>> -	 * prefix len and prefix matches.
>>> -	 */
>>> -	if (a->prefixlen && a->prefixlen == b->prefixlen)
>>> -		return strncmp(a->name, b->name, b->prefixlen);
>>
>> nice to see this one removed ;-)
>>
>>>   	return strcmp(a->name, b->name);
>>>   }
>>>   
>>> -#ifndef max
>>> -#define max(x, y) ((x) < (y) ? (y) : (x))
>>> -#endif
>>> -
>>>   static int saved_functions_cmp(const void *_a, const void *_b)
>>>   {
>>>   	const struct btf_encoder_func_state *a = _a;
>>>   	const struct btf_encoder_func_state *b = _b;
>>>   
>>> -	return functions_cmp(a->elf, b->elf);
>>> +	return elf_function__name_cmp(a->elf, b->elf);
>>>   }
>>>   
>>>   static int saved_functions_combine(struct btf_encoder_func_state *a, struct btf_encoder_func_state *b)
>>>   {
>>> -	uint8_t optimized, unexpected, inconsistent;
>>> -	int ret;
>>> +	uint8_t optimized, unexpected, inconsistent, ambiguous_addr;
>>> +
>>> +	if (a->elf != b->elf)
>>> +		return 1;
>>>   
>>> -	ret = strncmp(a->elf->name, b->elf->name,
>>> -		      max(a->elf->prefixlen, b->elf->prefixlen));
>>> -	if (ret != 0)
>>> -		return ret;
>>>   	optimized = a->optimized_parms | b->optimized_parms;
>>>   	unexpected = a->unexpected_reg | b->unexpected_reg;
>>>   	inconsistent = a->inconsistent_proto | b->inconsistent_proto;
>>> -	if (!unexpected && !inconsistent && !funcs__match(a, b))
>>> +	ambiguous_addr = a->ambiguous_addr | b->ambiguous_addr;
>>> +	if (!unexpected && !inconsistent && !ambiguous_addr && !funcs__match(a, b))
>>>   		inconsistent = 1;
>>>   	a->optimized_parms = b->optimized_parms = optimized;
>>>   	a->unexpected_reg = b->unexpected_reg = unexpected;
>>>   	a->inconsistent_proto = b->inconsistent_proto = inconsistent;
>>> +	a->ambiguous_addr = b->ambiguous_addr = ambiguous_addr;
>>
>>
>> I had to add change below to get the functions with multiple addresses out
>>
>> diff --git a/btf_encoder.c b/btf_encoder.c
>> index fcc30aa9d97f..7b9679794790 100644
>> --- a/btf_encoder.c
>> +++ b/btf_encoder.c
>> @@ -1466,7 +1466,7 @@ static int btf_encoder__add_saved_funcs(struct btf_encoder *encoder, bool skip_e
>>   		 * just do not _use_ them.  Only exclude functions with
>>   		 * unexpected register use or multiple inconsistent prototypes.
>>   		 */
>> -		add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto;
>> +		add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto && !state->ambiguous_addr;
>>   
>>   		if (add_to_btf) {
>>   			err = btf_encoder__add_func(state->encoder, state);
>>
>>
>> other than that I like the approach
>>
> 
> Thanks for the patch! I ran it through CI [1] with the above change plus
> an added whitespace after the function name in the printf() in
> btf_encoder__log_func_skip(). The btf_functions.sh test expects
> whitespace after function names when examining skipped functions, so
> either the test should be updated to handle no whitespace or we should
> ensure the space is there after the function name like this:
> 
>          printf("%s : skipping BTF encoding of function due to ",
> func->name);
> 
> Otherwise we get a CI failure that is nothing to do with the changes.
> 
> With this in place we do however lose a lot of functions it seems, some
> I suspect unnecessarily. For example:
> 
> 
> Looking at
> 
>   < void __tcp_send_ack(struct sock * sk, u32 rcv_nxt, u16 flags);
> 
> ffffffff83c83170 t __tcp_send_ack.part.0
> ffffffff83c83310 T __tcp_send_ack
> 
> So __tcp_send_ack is partially inlined, but partial inlines should not
> count as ambiguous addresses I think. We should probably ensure we skip
> .part suffixes as well as .cold in calculating ambiguous addresses.
> 
> I modified the patch somewhat and we wind up losing ~400 functions
> instead of over 700, see [2].
> 
> Modified patch is at [3]. If the mods look okay to you Ihor would you
> mind sending it officially? Would be great to get wider testing to
> ensure it doesn't break anything or leave any functions out unexpectedly.

Alan, Jiri, thank you for review and testing. I sent this draft in a bit 
of a rush, sorry.

I'll incorporate your suggestions, test the patch a bit more and then
will send a clean version. I am curious what functions are lost and
why, will report if notice anything interesting.

> 
>> SNIP
>>
>>> @@ -2153,18 +2191,75 @@ static int elf_functions__collect(struct elf_functions *functions)
>>>   		goto out_free;
>>>   	}
>>>   
>>> +	/* First, collect an elf_function for each GElf_Sym
>>> +	 * Where func->name is without a suffix
>>> +	 */
>>>   	functions->cnt = 0;
>>>   	elf_symtab__for_each_symbol_index(functions->symtab, core_id, sym, sym_sec_idx) {
>>> -		elf_functions__collect_function(functions, &sym);
>>> +
>>> +		if (elf_sym__type(&sym) != STT_FUNC)
>>> +			continue;
>>> +
>>> +		sym_name = elf_sym__name(&sym, functions->symtab);
>>> +		if (!sym_name)
>>> +			continue;
>>> +
>>> +		func = &functions->entries[functions->cnt];
>>> +
>>> +		const char *suffix = strchr(sym_name, '.');
>>> +		if (suffix) {
>>> +			functions->suffix_cnt++;
>>
>> do we need suffix_cnt now?
>>
> 
> think it's been unused for a while now, so can be removed I think.
> 
> Thanks again for working on this!
> 
> Alan
> 
> [1] https://github.com/alan-maguire/dwarves/actions/runs/16500065295
> [2]
> https://github.com/alan-maguire/dwarves/actions/runs/16501897430/job/46662503155
> [3]
> https://github.com/acmel/dwarves/commit/30dffd7fc34e7753b3d21b4b3f1a5e17814c224f
> 
>> thanks,
>> jirka
>>
>>
>>> +			func->name = strndup(sym_name, suffix - sym_name);
>>> +		} else {
>>> +			func->name = strdup(sym_name);
>>> +		}
>>> +		if (!func->name) {
>>> +			err = -ENOMEM;
>>> +			goto out_free;
>>> +		}
>>> +
>>> +		func_sym.name = sym_name;
>>> +		func_sym.addr = sym.st_value;
>>> +
>>> +		err = elf_function__push_sym(func, &func_sym);
>>> +		if (err)
>>> +			goto out_free;
>>> +
>>> +		functions->cnt++;
>>>   	}
>>
>> SNIP
> 


  reply	other threads:[~2025-07-24 21:27 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-17 15:25 [RFC dwarves] btf_encoder: Remove duplicates from functions entries Jiri Olsa
2025-07-21 11:41 ` Alan Maguire
2025-07-21 14:27   ` Jiri Olsa
2025-07-21 14:32     ` Nick Alcock
2025-07-21 23:27     ` Ihor Solodrai
2025-07-22 10:45       ` Alan Maguire
2025-07-22 22:58         ` Ihor Solodrai
2025-07-23 11:22           ` Jiri Olsa
2025-07-24 17:54             ` Alan Maguire
2025-07-24 21:26               ` Ihor Solodrai [this message]
2025-07-22 10:54       ` Jiri Olsa
2025-07-22 16:07         ` Ihor Solodrai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83d1b791-85fd-49ea-9c40-f3ba4c23850d@linux.dev \
    --to=ihor.solodrai@linux.dev \
    --cc=acme@kernel.org \
    --cc=alan.maguire@oracle.com \
    --cc=andriin@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=dwarves@vger.kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=menglong8.dong@gmail.com \
    --cc=olsajiri@gmail.com \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).