Re: [PATCH dwarves v1] btf_encoder: handle .BTF_ids section endianness when cross-compiling

From: Jiri Olsa <olsajiri@gmail.com>
To: Eduard Zingerman <eddyz87@gmail.com>
Cc: dwarves@vger.kernel.org, arnaldo.melo@gmail.com,
	bpf@vger.kernel.org, kernel-team@fb.com, ast@kernel.org,
	daniel@iogearbox.net, andrii@kernel.org, yonghong.song@linux.dev,
	Alan Maguire <alan.maguire@oracle.com>, Daniel Xu <dxu@dxuuu.xyz>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Vadim Fedorenko <vadfed@meta.com>
Subject: Re: [PATCH dwarves v1] btf_encoder: handle .BTF_ids section endianness when cross-compiling
Date: Fri, 22 Nov 2024 16:11:01 +0100
Message-ID: <Z0CfBQR8zxgJv_AP@krava>
In-Reply-To: <20241122070218.3832680-1-eddyz87@gmail.com>

On Thu, Nov 21, 2024 at 11:02:18PM -0800, Eduard Zingerman wrote:
> btf_encoder__tag_kfuncs() reads .BTF_ids section to identify a set of
> kfuncs present in the ELF being processed. This section consists of
> records of the following shape:
> 
>   struct btf_id_and_flag {
>       uint32_t id;
>       uint32_t flags;
>   };

the section contains pairs like the above and also plain id arrays with
no flags, but that does not matter for the patch, because you swap every
u32 value anyway
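
to illustrate why the layout does not matter, here is a minimal sketch (not
the patch code): swapping every 32-bit word fixes up both a plain id array
and an array of {id, flags} pairs, since both are just runs of u32 values.
The helper name bswap32_buf is made up for this example, and <byteswap.h>
is assumed to be available (glibc).

```c
#include <assert.h>
#include <byteswap.h>
#include <stddef.h>
#include <stdint.h>

/* Record shape quoted in the commit message above. */
struct btf_id_and_flag {
	uint32_t id;
	uint32_t flags;
};

/*
 * Hypothetical helper: byte-swap every 32-bit word of buf in place.
 * nbytes must be a multiple of sizeof(uint32_t).
 */
static void bswap32_buf(uint32_t *buf, size_t nbytes)
{
	size_t i;

	assert(nbytes % sizeof(uint32_t) == 0);
	for (i = 0; i < nbytes / sizeof(uint32_t); i++)
		buf[i] = bswap_32(buf[i]);
}
```

the same call works on a uint32_t ids[] array and on a
struct btf_id_and_flag pairs[] array cast to uint32_t *.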

> 
> When the endianness of the binary processed by pahole differs from the
> host endianness, these fields must be byte-swapped before use.
> 
> At the moment such byte swap does not happen and kfuncs are not marked
> with decl tags when e.g. s390 kernel is compiled on x86.
> To reproduce the bug:
> - follow instructions from [0] to build an s390 vmlinux;
> - execute:
>   pahole --btf_features_strict=decl_tag_kfuncs,decl_tag \
>          --btf_encode_detached=test.btf vmlinux
> - observe no kfuncs generated:
>   bpftool btf dump test.btf format c | grep __ksym
> 
> This commit fixes the issue by adding an endianness conversion step
> for .BTF_ids section data before main processing step, modifying the
> Elf_Data object in-place.
> This approach was chosen in order to:
> - minimize changes;
> - keep using Elf_Data, as it provides fields {d_size,d_off} used
>   by kfunc processing routines;
> - avoid sprinkling bswap_32 at each 'struct btf_id_and_flag' field
>   access in fear of forgetting to add new ones when code is modified.

lgtm, some questions below

> 
> [0] https://docs.kernel.org/bpf/s390.html
> 
> Cc: Alan Maguire <alan.maguire@oracle.com>
> Cc: Daniel Xu <dxu@dxuuu.xyz>
> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Cc: Vadim Fedorenko <vadfed@meta.com>
> Fixes: 72e88f29c6f7 ("pahole: Inject kfunc decl tags into BTF")
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> ---
>  btf_encoder.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  lib/bpf       |  2 +-
>  2 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/btf_encoder.c b/btf_encoder.c
> index e1adddf..3bdb73b 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -33,6 +33,7 @@
>  #include <stdint.h>
>  #include <search.h> /* for tsearch(), tfind() and tdestroy() */
>  #include <pthread.h>
> +#include <byteswap.h>
>  
>  #define BTF_IDS_SECTION		".BTF_ids"
>  #define BTF_ID_FUNC_PFX		"__BTF_ID__func__"
> @@ -1847,11 +1848,47 @@ static int btf_encoder__tag_kfunc(struct btf_encoder *encoder, struct gobuffer *
>  	return 0;
>  }
>  
> +/* If byte order of 'elf' differs from current byte order, convert the data->d_buf.
> + * ELF file is opened in a readonly mode, so data->d_buf cannot be modified in place.
> + * Instead, allocate a new buffer if modification is necessary.
> + */
> +static int convert_idlist_endianness(Elf *elf, Elf_Data *data, bool *copied)
> +{
> +	int byteorder, i;
> +	char *elf_ident;
> +	uint32_t *tmp;
> +
> +	*copied = false;
> +	elf_ident = elf_getident(elf, NULL);
> +	if (elf_ident == NULL) {
> +		fprintf(stderr, "Cannot get ELF identification from header\n");
> +		return -EINVAL;
> +	}
> +	byteorder = elf_ident[EI_DATA];
> +	if ((BYTE_ORDER == LITTLE_ENDIAN && byteorder == ELFDATA2LSB)
> +	    || (BYTE_ORDER == BIG_ENDIAN && byteorder == ELFDATA2MSB))
> +		return 0;
> +	tmp = malloc(data->d_size);
> +	if (tmp == NULL) {
> +		fprintf(stderr, "Cannot allocate %lu bytes of memory\n", data->d_size);
> +		return -ENOMEM;
> +	}
> +	memcpy(tmp, data->d_buf, data->d_size);
> +	data->d_buf = tmp;

will the original data->d_buf be leaked? are we allowed to assign d_buf like that? ;-)
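
one possible way to avoid the ownership question, as an untested sketch
rather than a request: leave data->d_buf alone and keep the swapped copy
in a separate pointer that the caller frees. The helper name dup_bswap32
is hypothetical, and the kfunc processing code would then need to read ids
from the copy while still taking d_size/d_off from the Elf_Data, which may
be more churn than you want.

```c
#include <assert.h>
#include <byteswap.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a malloc'ed copy of src with every
 * 32-bit word byte-swapped. The source buffer is not modified.
 * Returns NULL on allocation failure; the caller must free() the result.
 */
static uint32_t *dup_bswap32(const void *src, size_t nbytes)
{
	size_t i, n = nbytes / sizeof(uint32_t);
	uint32_t *copy;

	assert(src != NULL);
	copy = malloc(nbytes);
	if (copy == NULL)
		return NULL;
	memcpy(copy, src, nbytes);
	for (i = 0; i < n; i++)
		copy[i] = bswap_32(copy[i]);
	return copy;
}
```

with this shape the buffer owned by libelf is never touched, so the
cleanup path only has to free() the copy.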

> +	*copied = true;
> +
> +	/* .BTF_ids sections consist of u32 objects */
> +	for (i = 0; i < data->d_size / sizeof(uint32_t); i++)
> +		tmp[i] = bswap_32(tmp[i]);
> +	return 0;
> +}
> +
>  static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder)
>  {
>  	const char *filename = encoder->source_filename;
>  	struct gobuffer btf_kfunc_ranges = {};
>  	struct gobuffer btf_funcs = {};
> +	bool free_idlist = false;
>  	Elf_Data *symbols = NULL;
>  	Elf_Data *idlist = NULL;
>  	Elf_Scn *symscn = NULL;
> @@ -1919,6 +1956,9 @@ static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder)
>  			idlist_shndx = i;
>  			idlist_addr = shdr.sh_addr;
>  			idlist = data;
> +			err = convert_idlist_endianness(elf, idlist, &free_idlist);
> +			if (err < 0)
> +				goto out;
>  		}
>  	}
>  
> @@ -2031,6 +2071,8 @@ static int btf_encoder__tag_kfuncs(struct btf_encoder *encoder)
>  out:
>  	__gobuffer__delete(&btf_funcs);
>  	__gobuffer__delete(&btf_kfunc_ranges);
> +	if (free_idlist)
> +		free(idlist->d_buf);
>  	if (elf)
>  		elf_end(elf);

curious, would elf_end try to free the d_buf at some point?

>  	if (fd != -1)
> diff --git a/lib/bpf b/lib/bpf
> index 09b9e83..caa17bd 160000
> --- a/lib/bpf
> +++ b/lib/bpf
> @@ -1 +1 @@
> -Subproject commit 09b9e83102eb8ab9e540d36b4559c55f3bcdb95d
> +Subproject commit caa17bdcbfc58e68eaf4d017c058e6577606bf56

I think this should not be part of the patch

thanks,
jirka