public inbox for dwarves@vger.kernel.org
 help / color / mirror / Atom feed
From: Alan Maguire <alan.maguire@oracle.com>
To: Jiri Olsa <olsajiri@gmail.com>, Ihor Solodrai <ihor.solodrai@linux.dev>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Menglong Dong <menglong8.dong@gmail.com>,
	dwarves@vger.kernel.org, bpf@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andriin@fb.com>, Yonghong Song <yhs@fb.com>,
	Song Liu <songliubraving@fb.com>,
	Eduard Zingerman <eddyz87@gmail.com>
Subject: Re: [RFC dwarves] btf_encoder: Remove duplicates from functions entries
Date: Thu, 24 Jul 2025 18:54:53 +0100	[thread overview]
Message-ID: <f44af47f-e05e-4fa4-95ca-bf95f04e4c27@oracle.com> (raw)
In-Reply-To: <aIDFh26qdAURVL0u@krava>

On 23/07/2025 12:22, Jiri Olsa wrote:
> On Tue, Jul 22, 2025 at 10:58:52PM +0000, Ihor Solodrai wrote:
> 
> SNIP
> 
>> @@ -1338,48 +1381,39 @@ static int32_t btf_encoder__add_func(struct btf_encoder *encoder,
>>  	return 0;
>>  }
>>  
>> -static int functions_cmp(const void *_a, const void *_b)
>> +static int elf_function__name_cmp(const void *_a, const void *_b)
>>  {
>>  	const struct elf_function *a = _a;
>>  	const struct elf_function *b = _b;
>>  
>> -	/* if search key allows prefix match, verify target has matching
>> -	 * prefix len and prefix matches.
>> -	 */
>> -	if (a->prefixlen && a->prefixlen == b->prefixlen)
>> -		return strncmp(a->name, b->name, b->prefixlen);
> 
> nice to see this one removed ;-)
> 
>>  	return strcmp(a->name, b->name);
>>  }
>>  
>> -#ifndef max
>> -#define max(x, y) ((x) < (y) ? (y) : (x))
>> -#endif
>> -
>>  static int saved_functions_cmp(const void *_a, const void *_b)
>>  {
>>  	const struct btf_encoder_func_state *a = _a;
>>  	const struct btf_encoder_func_state *b = _b;
>>  
>> -	return functions_cmp(a->elf, b->elf);
>> +	return elf_function__name_cmp(a->elf, b->elf);
>>  }
>>  
>>  static int saved_functions_combine(struct btf_encoder_func_state *a, struct btf_encoder_func_state *b)
>>  {
>> -	uint8_t optimized, unexpected, inconsistent;
>> -	int ret;
>> +	uint8_t optimized, unexpected, inconsistent, ambiguous_addr;
>> +
>> +	if (a->elf != b->elf)
>> +		return 1;
>>  
>> -	ret = strncmp(a->elf->name, b->elf->name,
>> -		      max(a->elf->prefixlen, b->elf->prefixlen));
>> -	if (ret != 0)
>> -		return ret;
>>  	optimized = a->optimized_parms | b->optimized_parms;
>>  	unexpected = a->unexpected_reg | b->unexpected_reg;
>>  	inconsistent = a->inconsistent_proto | b->inconsistent_proto;
>> -	if (!unexpected && !inconsistent && !funcs__match(a, b))
>> +	ambiguous_addr = a->ambiguous_addr | b->ambiguous_addr;
>> +	if (!unexpected && !inconsistent && !ambiguous_addr && !funcs__match(a, b))
>>  		inconsistent = 1;
>>  	a->optimized_parms = b->optimized_parms = optimized;
>>  	a->unexpected_reg = b->unexpected_reg = unexpected;
>>  	a->inconsistent_proto = b->inconsistent_proto = inconsistent;
>> +	a->ambiguous_addr = b->ambiguous_addr = ambiguous_addr;
> 
> 
> I had to add change below to get the functions with multiple addresses out
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index fcc30aa9d97f..7b9679794790 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -1466,7 +1466,7 @@ static int btf_encoder__add_saved_funcs(struct btf_encoder *encoder, bool skip_e
>  		 * just do not _use_ them.  Only exclude functions with
>  		 * unexpected register use or multiple inconsistent prototypes.
>  		 */
> -		add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto;
> +		add_to_btf |= !state->unexpected_reg && !state->inconsistent_proto && !state->ambiguous_addr;
>  
>  		if (add_to_btf) {
>  			err = btf_encoder__add_func(state->encoder, state);
> 
> 
> other than that I like the approach
>

Thanks for the patch! I ran it through CI [1] with the above change plus
an added whitespace after the function name in the printf() in
btf_encoder__log_func_skip(). The btf_functions.sh test expects
whitespace after function names when examining skipped functions, so
either the test should be updated to handle no whitespace or we should
ensure the space is there after the function name like this:

        printf("%s : skipping BTF encoding of function due to ",
func->name);

Otherwise we get a CI failure that is nothing to do with the changes.

With this in place we do however lose a lot of functions it seems, some
I suspect unnecessarily. For example:


Looking at

 < void __tcp_send_ack(struct sock * sk, u32 rcv_nxt, u16 flags);

ffffffff83c83170 t __tcp_send_ack.part.0
ffffffff83c83310 T __tcp_send_ack

So __tcp_send_ack is partially inlined, but partial inlines should not
count as ambiguous addresses I think. We should probably ensure we skip
.part suffixes as well as .cold in calculating ambiguous addresses.

I modified the patch somewhat and we wind up losing ~400 functions
instead of over 700, see [2].

Modified patch is at [3]. If the mods look okay to you Ihor would you
mind sending it officially? Would be great to get wider testing to
ensure it doesn't break anything or leave any functions out unexpectedly.

> SNIP
> 
>> @@ -2153,18 +2191,75 @@ static int elf_functions__collect(struct elf_functions *functions)
>>  		goto out_free;
>>  	}
>>  
>> +	/* First, collect an elf_function for each GElf_Sym
>> +	 * Where func->name is without a suffix
>> +	 */
>>  	functions->cnt = 0;
>>  	elf_symtab__for_each_symbol_index(functions->symtab, core_id, sym, sym_sec_idx) {
>> -		elf_functions__collect_function(functions, &sym);
>> +
>> +		if (elf_sym__type(&sym) != STT_FUNC)
>> +			continue;
>> +
>> +		sym_name = elf_sym__name(&sym, functions->symtab);
>> +		if (!sym_name)
>> +			continue;
>> +
>> +		func = &functions->entries[functions->cnt];
>> +
>> +		const char *suffix = strchr(sym_name, '.');
>> +		if (suffix) {
>> +			functions->suffix_cnt++;
> 
> do we need suffix_cnt now?
> 

think it's been unused for a while now, so can be removed I think.

Thanks again for working on this!

Alan

[1] https://github.com/alan-maguire/dwarves/actions/runs/16500065295
[2]
https://github.com/alan-maguire/dwarves/actions/runs/16501897430/job/46662503155
[3]
https://github.com/acmel/dwarves/commit/30dffd7fc34e7753b3d21b4b3f1a5e17814c224f

> thanks,
> jirka
> 
> 
>> +			func->name = strndup(sym_name, suffix - sym_name);
>> +		} else {
>> +			func->name = strdup(sym_name);
>> +		}
>> +		if (!func->name) {
>> +			err = -ENOMEM;
>> +			goto out_free;
>> +		}
>> +
>> +		func_sym.name = sym_name;
>> +		func_sym.addr = sym.st_value;
>> +
>> +		err = elf_function__push_sym(func, &func_sym);
>> +		if (err)
>> +			goto out_free;
>> +
>> +		functions->cnt++;
>>  	}
> 
> SNIP


  reply	other threads:[~2025-07-24 17:55 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-17 15:25 [RFC dwarves] btf_encoder: Remove duplicates from functions entries Jiri Olsa
2025-07-21 11:41 ` Alan Maguire
2025-07-21 14:27   ` Jiri Olsa
2025-07-21 14:32     ` Nick Alcock
2025-07-21 23:27     ` Ihor Solodrai
2025-07-22 10:45       ` Alan Maguire
2025-07-22 22:58         ` Ihor Solodrai
2025-07-23 11:22           ` Jiri Olsa
2025-07-24 17:54             ` Alan Maguire [this message]
2025-07-24 21:26               ` Ihor Solodrai
2025-07-22 10:54       ` Jiri Olsa
2025-07-22 16:07         ` Ihor Solodrai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f44af47f-e05e-4fa4-95ca-bf95f04e4c27@oracle.com \
    --to=alan.maguire@oracle.com \
    --cc=acme@kernel.org \
    --cc=andriin@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=dwarves@vger.kernel.org \
    --cc=eddyz87@gmail.com \
    --cc=ihor.solodrai@linux.dev \
    --cc=menglong8.dong@gmail.com \
    --cc=olsajiri@gmail.com \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox