Re: [PATCH bpf-next 2/2] perf: stop using deprecated bpf__object_next() API

linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Jiri Olsa <jolsa@redhat.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Christy Lee <christylee@fb.com>,
	Christy Lee <christyc.y.lee@gmail.com>, bpf <bpf@vger.kernel.org>,
	"linux-perf-use." <linux-perf-users@vger.kernel.org>,
	Kernel Team <kernel-team@fb.com>, He Kuang <hekuang@huawei.com>,
	Wang Nan <wangnan0@huawei.com>,
	Wang ShaoBo <bobo.shaobowang@huawei.com>,
	YueHaibing <yuehaibing@huawei.com>
Subject: Re: [PATCH bpf-next 2/2] perf: stop using deprecated bpf__object_next() API
Date: Sat, 22 Jan 2022 17:29:56 -0300	[thread overview]
Message-ID: <YexpRMs2jL+jH83e@kernel.org> (raw)
In-Reply-To: <YeU2J91BQI8ig1TV@krava>

Em Mon, Jan 17, 2022 at 10:25:59AM +0100, Jiri Olsa escreveu:
> On Fri, Jan 14, 2022 at 01:00:45PM -0800, Andrii Nakryiko wrote:
> > On Thu, Jan 13, 2022 at 7:14 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Thu, Jan 06, 2022 at 09:54:38AM -0800, Christy Lee wrote:
> > > > Thank you so much, I was able to reproduce the original tests after applying
> > > > the bug fix. I will submit a new patch set with the more detailed comments.
> > > >
> > > > The only deprecated functions that need to be removed after this would be
> > > > bpf_program__set_prep() (how perf sets the bpf prologue) and
> > > > bpf_program__nth_fd() (how perf leverages multi instance bpf). They look a
> > > > little more involved and I'm not sure how to approach those. Jiri, would you
> > > > mind taking a look at those please?
> > >
> > > hi,
> > > I checked and here's the way perf uses this interface:
> > >
> > >   - when bpf object/sources is specified on perf command line
> > >     we use bpf_object__open to load it
> > >
> > >   - user can define parameters in the section name for each bpf program
> > >     like:
> > >
> > >       SEC("lock_page=__lock_page page->flags")
> > >       int lock_page(struct pt_regs *ctx, int err, unsigned long flags)
> > >       {
> > >              return 1;
> > >       }
> > >
> > >     which tells perf to 'prepare' some extra bpf code for the program,
> > >     like to put value of 'page->flags' into 'flags' argument above
> > >
> > >   - perf generates extra prologue code to retrieve this data and does
> > >     that before the program is loaded by using bpf_program__set_prep
> > >     callback
> > >
> > >   - now the reason why we use bpf_program__set_prep for that, is because
> > >     it allows to create multiple instances for one bpf program
> > >
> > >   - we need multiple instances for single program, because probe can
> > >     result in multiple attach addresses (like for inlined functions)
> > >     with possible different ways of getting the arguments we need
> > >     to load
> > >
> > > I guess you want to get rid of that whole 'struct instances' related
> > > stuff, is that right?
> > >
> > > perf would need to load all the needed instances for program manually
> > > and somehow bypass/workaround bpf_object__load.. is there a way to
> > > manually add extra programs to bpf_object?
> > >
> > > thoughts? ;-)
> > 
> > Sigh..
> > 
> > 1. SEC("lock_page=__lock_page page->flags") will break in libbpf 1.0.
> > I'm going to add a way to provide a custom callback to handle such BPF
> > program sections by your custom code, but... Who's using this? Is
> > anyone using this? How is this used and for what? Would it be possible
> > to just kill this feature?
> 
> good question ;-) IMO it was added in the early ebpf times, when nobody
> knew what will become the preferred way of doing things
> 
> we don't know if there are any users of this, but:
> 
> I had to go through the code to find out how to use it and it was broken
> in perf trace for some time while nobody complained ;-) also I don't think
> this is advertised anywhere in the doc
> 
> Arnaldo,
> thoughts on removing this? ;-) I tried with the quick patch below, and
> the standard perf trace ebpf support won't be affected by this
> 
> the patch is removing the support for generating the ebpf program prologue
> which includes the usage of libbpf's instances APIs
> 
> we could also remove the special section config parsing, which is used
> by prologue generation code

This was all done a long time ago, mostly by Wang Nan, so if you tested
it based on the committer testing comments, etc, and everything seems to
work...

I'll try and give it a go after pushing the current lot to Linus.

- Arnaldo
 
> jirka
> 
> 
> ---
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index 96ad944ca6a8..d9ff537d999e 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -556,17 +556,6 @@ ifndef NO_LIBELF
>        endif
>      endif
>  
> -    ifndef NO_DWARF
> -      ifdef PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
> -        CFLAGS += -DHAVE_BPF_PROLOGUE
> -        $(call detected,CONFIG_BPF_PROLOGUE)
> -      else
> -        msg := $(warning BPF prologue is not supported by architecture $(SRCARCH), missing regs_query_register_offset());
> -      endif
> -    else
> -      msg := $(warning DWARF support is off, BPF prologue is disabled);
> -    endif
> -
>    endif # NO_LIBBPF
>  endif # NO_LIBELF
>  
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 6ac2160913ea..a04c02aed4c7 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -2685,20 +2685,6 @@ int cmd_record(int argc, const char **argv)
>  	set_nobuild('\0', "clang-path", true);
>  	set_nobuild('\0', "clang-opt", true);
>  # undef set_nobuild
> -#endif
> -
> -#ifndef HAVE_BPF_PROLOGUE
> -# if !defined (HAVE_DWARF_SUPPORT)
> -#  define REASON  "NO_DWARF=1"
> -# elif !defined (HAVE_LIBBPF_SUPPORT)
> -#  define REASON  "NO_LIBBPF=1"
> -# else
> -#  define REASON  "this architecture doesn't support BPF prologue"
> -# endif
> -# define set_nobuild(s, l, c) set_option_nobuild(record_options, s, l, REASON, c)
> -	set_nobuild('\0', "vmlinux", true);
> -# undef set_nobuild
> -# undef REASON
>  #endif
>  
>  	rec->opts.affinity = PERF_AFFINITY_SYS;
> diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
> index 22662fc85cc9..0e3f24dfee2c 100644
> --- a/tools/perf/util/bpf-loader.c
> +++ b/tools/perf/util/bpf-loader.c
> @@ -40,10 +40,6 @@ struct bpf_prog_priv {
>  	char *sys_name;
>  	char *evt_name;
>  	struct perf_probe_event pev;
> -	bool need_prologue;
> -	struct bpf_insn *insns_buf;
> -	int nr_types;
> -	int *type_mapping;
>  };
>  
>  static bool libbpf_initialized;
> @@ -125,8 +121,6 @@ clear_prog_priv(struct bpf_program *prog __maybe_unused,
>  	struct bpf_prog_priv *priv = _priv;
>  
>  	cleanup_perf_probe_events(&priv->pev, 1);
> -	zfree(&priv->insns_buf);
> -	zfree(&priv->type_mapping);
>  	zfree(&priv->sys_name);
>  	zfree(&priv->evt_name);
>  	free(priv);
> @@ -409,220 +403,6 @@ static int bpf__prepare_probe(void)
>  	return err;
>  }
>  
> -static int
> -preproc_gen_prologue(struct bpf_program *prog, int n,
> -		     struct bpf_insn *orig_insns, int orig_insns_cnt,
> -		     struct bpf_prog_prep_result *res)
> -{
> -	struct bpf_prog_priv *priv = bpf_program__priv(prog);
> -	struct probe_trace_event *tev;
> -	struct perf_probe_event *pev;
> -	struct bpf_insn *buf;
> -	size_t prologue_cnt = 0;
> -	int i, err;
> -
> -	if (IS_ERR_OR_NULL(priv) || priv->is_tp)
> -		goto errout;
> -
> -	pev = &priv->pev;
> -
> -	if (n < 0 || n >= priv->nr_types)
> -		goto errout;
> -
> -	/* Find a tev belongs to that type */
> -	for (i = 0; i < pev->ntevs; i++) {
> -		if (priv->type_mapping[i] == n)
> -			break;
> -	}
> -
> -	if (i >= pev->ntevs) {
> -		pr_debug("Internal error: prologue type %d not found\n", n);
> -		return -BPF_LOADER_ERRNO__PROLOGUE;
> -	}
> -
> -	tev = &pev->tevs[i];
> -
> -	buf = priv->insns_buf;
> -	err = bpf__gen_prologue(tev->args, tev->nargs,
> -				buf, &prologue_cnt,
> -				BPF_MAXINSNS - orig_insns_cnt);
> -	if (err) {
> -		const char *title;
> -
> -		title = bpf_program__section_name(prog);
> -		pr_debug("Failed to generate prologue for program %s\n",
> -			 title);
> -		return err;
> -	}
> -
> -	memcpy(&buf[prologue_cnt], orig_insns,
> -	       sizeof(struct bpf_insn) * orig_insns_cnt);
> -
> -	res->new_insn_ptr = buf;
> -	res->new_insn_cnt = prologue_cnt + orig_insns_cnt;
> -	res->pfd = NULL;
> -	return 0;
> -
> -errout:
> -	pr_debug("Internal error in preproc_gen_prologue\n");
> -	return -BPF_LOADER_ERRNO__PROLOGUE;
> -}
> -
> -/*
> - * compare_tev_args is reflexive, transitive and antisymmetric.
> - * I can proof it but this margin is too narrow to contain.
> - */
> -static int compare_tev_args(const void *ptev1, const void *ptev2)
> -{
> -	int i, ret;
> -	const struct probe_trace_event *tev1 =
> -		*(const struct probe_trace_event **)ptev1;
> -	const struct probe_trace_event *tev2 =
> -		*(const struct probe_trace_event **)ptev2;
> -
> -	ret = tev2->nargs - tev1->nargs;
> -	if (ret)
> -		return ret;
> -
> -	for (i = 0; i < tev1->nargs; i++) {
> -		struct probe_trace_arg *arg1, *arg2;
> -		struct probe_trace_arg_ref *ref1, *ref2;
> -
> -		arg1 = &tev1->args[i];
> -		arg2 = &tev2->args[i];
> -
> -		ret = strcmp(arg1->value, arg2->value);
> -		if (ret)
> -			return ret;
> -
> -		ref1 = arg1->ref;
> -		ref2 = arg2->ref;
> -
> -		while (ref1 && ref2) {
> -			ret = ref2->offset - ref1->offset;
> -			if (ret)
> -				return ret;
> -
> -			ref1 = ref1->next;
> -			ref2 = ref2->next;
> -		}
> -
> -		if (ref1 || ref2)
> -			return ref2 ? 1 : -1;
> -	}
> -
> -	return 0;
> -}
> -
> -/*
> - * Assign a type number to each tevs in a pev.
> - * mapping is an array with same slots as tevs in that pev.
> - * nr_types will be set to number of types.
> - */
> -static int map_prologue(struct perf_probe_event *pev, int *mapping,
> -			int *nr_types)
> -{
> -	int i, type = 0;
> -	struct probe_trace_event **ptevs;
> -
> -	size_t array_sz = sizeof(*ptevs) * pev->ntevs;
> -
> -	ptevs = malloc(array_sz);
> -	if (!ptevs) {
> -		pr_debug("Not enough memory: alloc ptevs failed\n");
> -		return -ENOMEM;
> -	}
> -
> -	pr_debug("In map_prologue, ntevs=%d\n", pev->ntevs);
> -	for (i = 0; i < pev->ntevs; i++)
> -		ptevs[i] = &pev->tevs[i];
> -
> -	qsort(ptevs, pev->ntevs, sizeof(*ptevs),
> -	      compare_tev_args);
> -
> -	for (i = 0; i < pev->ntevs; i++) {
> -		int n;
> -
> -		n = ptevs[i] - pev->tevs;
> -		if (i == 0) {
> -			mapping[n] = type;
> -			pr_debug("mapping[%d]=%d\n", n, type);
> -			continue;
> -		}
> -
> -		if (compare_tev_args(ptevs + i, ptevs + i - 1) == 0)
> -			mapping[n] = type;
> -		else
> -			mapping[n] = ++type;
> -
> -		pr_debug("mapping[%d]=%d\n", n, mapping[n]);
> -	}
> -	free(ptevs);
> -	*nr_types = type + 1;
> -
> -	return 0;
> -}
> -
> -static int hook_load_preprocessor(struct bpf_program *prog)
> -{
> -	struct bpf_prog_priv *priv = bpf_program__priv(prog);
> -	struct perf_probe_event *pev;
> -	bool need_prologue = false;
> -	int err, i;
> -
> -	if (IS_ERR_OR_NULL(priv)) {
> -		pr_debug("Internal error when hook preprocessor\n");
> -		return -BPF_LOADER_ERRNO__INTERNAL;
> -	}
> -
> -	if (priv->is_tp) {
> -		priv->need_prologue = false;
> -		return 0;
> -	}
> -
> -	pev = &priv->pev;
> -	for (i = 0; i < pev->ntevs; i++) {
> -		struct probe_trace_event *tev = &pev->tevs[i];
> -
> -		if (tev->nargs > 0) {
> -			need_prologue = true;
> -			break;
> -		}
> -	}
> -
> -	/*
> -	 * Since all tevs don't have argument, we don't need generate
> -	 * prologue.
> -	 */
> -	if (!need_prologue) {
> -		priv->need_prologue = false;
> -		return 0;
> -	}
> -
> -	priv->need_prologue = true;
> -	priv->insns_buf = malloc(sizeof(struct bpf_insn) * BPF_MAXINSNS);
> -	if (!priv->insns_buf) {
> -		pr_debug("Not enough memory: alloc insns_buf failed\n");
> -		return -ENOMEM;
> -	}
> -
> -	priv->type_mapping = malloc(sizeof(int) * pev->ntevs);
> -	if (!priv->type_mapping) {
> -		pr_debug("Not enough memory: alloc type_mapping failed\n");
> -		return -ENOMEM;
> -	}
> -	memset(priv->type_mapping, -1,
> -	       sizeof(int) * pev->ntevs);
> -
> -	err = map_prologue(pev, priv->type_mapping, &priv->nr_types);
> -	if (err)
> -		return err;
> -
> -	err = bpf_program__set_prep(prog, priv->nr_types,
> -				    preproc_gen_prologue);
> -	return err;
> -}
> -
>  int bpf__probe(struct bpf_object *obj)
>  {
>  	int err = 0;
> @@ -669,18 +449,6 @@ int bpf__probe(struct bpf_object *obj)
>  			pr_debug("bpf_probe: failed to apply perf probe events\n");
>  			goto out;
>  		}
> -
> -		/*
> -		 * After probing, let's consider prologue, which
> -		 * adds program fetcher to BPF programs.
> -		 *
> -		 * hook_load_preprocessor() hooks pre-processor
> -		 * to bpf_program, let it generate prologue
> -		 * dynamically during loading.
> -		 */
> -		err = hook_load_preprocessor(prog);
> -		if (err)
> -			goto out;
>  	}
>  out:
>  	return err < 0 ? err : 0;
> @@ -773,14 +541,7 @@ int bpf__foreach_event(struct bpf_object *obj,
>  		for (i = 0; i < pev->ntevs; i++) {
>  			tev = &pev->tevs[i];
>  
> -			if (priv->need_prologue) {
> -				int type = priv->type_mapping[i];
> -
> -				fd = bpf_program__nth_fd(prog, type);
> -			} else {
> -				fd = bpf_program__fd(prog);
> -			}
> -
> +			fd = bpf_program__fd(prog);
>  			if (fd < 0) {
>  				pr_debug("bpf: failed to get file descriptor\n");
>  				return fd;
> diff --git a/tools/perf/util/bpf-prologue.c b/tools/perf/util/bpf-prologue.c
> deleted file mode 100644
> index 9887ae09242d..000000000000
> --- a/tools/perf/util/bpf-prologue.c
> +++ /dev/null
> @@ -1,508 +0,0 @@
> -// SPDX-License-Identifier: GPL-2.0
> -/*
> - * bpf-prologue.c
> - *
> - * Copyright (C) 2015 He Kuang <hekuang@huawei.com>
> - * Copyright (C) 2015 Wang Nan <wangnan0@huawei.com>
> - * Copyright (C) 2015 Huawei Inc.
> - */
> -
> -#include <bpf/libbpf.h>
> -#include "debug.h"
> -#include "bpf-loader.h"
> -#include "bpf-prologue.h"
> -#include "probe-finder.h"
> -#include <errno.h>
> -#include <stdlib.h>
> -#include <dwarf-regs.h>
> -#include <linux/filter.h>
> -
> -#define BPF_REG_SIZE		8
> -
> -#define JMP_TO_ERROR_CODE	-1
> -#define JMP_TO_SUCCESS_CODE	-2
> -#define JMP_TO_USER_CODE	-3
> -
> -struct bpf_insn_pos {
> -	struct bpf_insn *begin;
> -	struct bpf_insn *end;
> -	struct bpf_insn *pos;
> -};
> -
> -static inline int
> -pos_get_cnt(struct bpf_insn_pos *pos)
> -{
> -	return pos->pos - pos->begin;
> -}
> -
> -static int
> -append_insn(struct bpf_insn new_insn, struct bpf_insn_pos *pos)
> -{
> -	if (!pos->pos)
> -		return -BPF_LOADER_ERRNO__PROLOGUE2BIG;
> -
> -	if (pos->pos + 1 >= pos->end) {
> -		pr_err("bpf prologue: prologue too long\n");
> -		pos->pos = NULL;
> -		return -BPF_LOADER_ERRNO__PROLOGUE2BIG;
> -	}
> -
> -	*(pos->pos)++ = new_insn;
> -	return 0;
> -}
> -
> -static int
> -check_pos(struct bpf_insn_pos *pos)
> -{
> -	if (!pos->pos || pos->pos >= pos->end)
> -		return -BPF_LOADER_ERRNO__PROLOGUE2BIG;
> -	return 0;
> -}
> -
> -/*
> - * Convert type string (u8/u16/u32/u64/s8/s16/s32/s64 ..., see
> - * Documentation/trace/kprobetrace.rst) to size field of BPF_LDX_MEM
> - * instruction (BPF_{B,H,W,DW}).
> - */
> -static int
> -argtype_to_ldx_size(const char *type)
> -{
> -	int arg_size = type ? atoi(&type[1]) : 64;
> -
> -	switch (arg_size) {
> -	case 8:
> -		return BPF_B;
> -	case 16:
> -		return BPF_H;
> -	case 32:
> -		return BPF_W;
> -	case 64:
> -	default:
> -		return BPF_DW;
> -	}
> -}
> -
> -static const char *
> -insn_sz_to_str(int insn_sz)
> -{
> -	switch (insn_sz) {
> -	case BPF_B:
> -		return "BPF_B";
> -	case BPF_H:
> -		return "BPF_H";
> -	case BPF_W:
> -		return "BPF_W";
> -	case BPF_DW:
> -		return "BPF_DW";
> -	default:
> -		return "UNKNOWN";
> -	}
> -}
> -
> -/* Give it a shorter name */
> -#define ins(i, p) append_insn((i), (p))
> -
> -/*
> - * Give a register name (in 'reg'), generate instruction to
> - * load register into an eBPF register rd:
> - *   'ldd target_reg, offset(ctx_reg)', where:
> - * ctx_reg is pre initialized to pointer of 'struct pt_regs'.
> - */
> -static int
> -gen_ldx_reg_from_ctx(struct bpf_insn_pos *pos, int ctx_reg,
> -		     const char *reg, int target_reg)
> -{
> -	int offset = regs_query_register_offset(reg);
> -
> -	if (offset < 0) {
> -		pr_err("bpf: prologue: failed to get register %s\n",
> -		       reg);
> -		return offset;
> -	}
> -	ins(BPF_LDX_MEM(BPF_DW, target_reg, ctx_reg, offset), pos);
> -
> -	return check_pos(pos);
> -}
> -
> -/*
> - * Generate a BPF_FUNC_probe_read function call.
> - *
> - * src_base_addr_reg is a register holding base address,
> - * dst_addr_reg is a register holding dest address (on stack),
> - * result is:
> - *
> - *  *[dst_addr_reg] = *([src_base_addr_reg] + offset)
> - *
> - * Arguments of BPF_FUNC_probe_read:
> - *     ARG1: ptr to stack (dest)
> - *     ARG2: size (8)
> - *     ARG3: unsafe ptr (src)
> - */
> -static int
> -gen_read_mem(struct bpf_insn_pos *pos,
> -	     int src_base_addr_reg,
> -	     int dst_addr_reg,
> -	     long offset,
> -	     int probeid)
> -{
> -	/* mov arg3, src_base_addr_reg */
> -	if (src_base_addr_reg != BPF_REG_ARG3)
> -		ins(BPF_MOV64_REG(BPF_REG_ARG3, src_base_addr_reg), pos);
> -	/* add arg3, #offset */
> -	if (offset)
> -		ins(BPF_ALU64_IMM(BPF_ADD, BPF_REG_ARG3, offset), pos);
> -
> -	/* mov arg2, #reg_size */
> -	ins(BPF_ALU64_IMM(BPF_MOV, BPF_REG_ARG2, BPF_REG_SIZE), pos);
> -
> -	/* mov arg1, dst_addr_reg */
> -	if (dst_addr_reg != BPF_REG_ARG1)
> -		ins(BPF_MOV64_REG(BPF_REG_ARG1, dst_addr_reg), pos);
> -
> -	/* Call probe_read  */
> -	ins(BPF_EMIT_CALL(probeid), pos);
> -	/*
> -	 * Error processing: if read fail, goto error code,
> -	 * will be relocated. Target should be the start of
> -	 * error processing code.
> -	 */
> -	ins(BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, JMP_TO_ERROR_CODE),
> -	    pos);
> -
> -	return check_pos(pos);
> -}
> -
> -/*
> - * Each arg should be bare register. Fetch and save them into argument
> - * registers (r3 - r5).
> - *
> - * BPF_REG_1 should have been initialized with pointer to
> - * 'struct pt_regs'.
> - */
> -static int
> -gen_prologue_fastpath(struct bpf_insn_pos *pos,
> -		      struct probe_trace_arg *args, int nargs)
> -{
> -	int i, err = 0;
> -
> -	for (i = 0; i < nargs; i++) {
> -		err = gen_ldx_reg_from_ctx(pos, BPF_REG_1, args[i].value,
> -					   BPF_PROLOGUE_START_ARG_REG + i);
> -		if (err)
> -			goto errout;
> -	}
> -
> -	return check_pos(pos);
> -errout:
> -	return err;
> -}
> -
> -/*
> - * Slow path:
> - *   At least one argument has the form of 'offset($rx)'.
> - *
> - * Following code first stores them into stack, then loads all of then
> - * to r2 - r5.
> - * Before final loading, the final result should be:
> - *
> - * low address
> - * BPF_REG_FP - 24  ARG3
> - * BPF_REG_FP - 16  ARG2
> - * BPF_REG_FP - 8   ARG1
> - * BPF_REG_FP
> - * high address
> - *
> - * For each argument (described as: offn(...off2(off1(reg)))),
> - * generates following code:
> - *
> - *  r7 <- fp
> - *  r7 <- r7 - stack_offset  // Ideal code should initialize r7 using
> - *                           // fp before generating args. However,
> - *                           // eBPF won't regard r7 as stack pointer
> - *                           // if it is generated by minus 8 from
> - *                           // another stack pointer except fp.
> - *                           // This is why we have to set r7
> - *                           // to fp for each variable.
> - *  r3 <- value of 'reg'-> generated using gen_ldx_reg_from_ctx()
> - *  (r7) <- r3       // skip following instructions for bare reg
> - *  r3 <- r3 + off1  . // skip if off1 == 0
> - *  r2 <- 8           \
> - *  r1 <- r7           |-> generated by gen_read_mem()
> - *  call probe_read    /
> - *  jnei r0, 0, err  ./
> - *  r3 <- (r7)
> - *  r3 <- r3 + off2  . // skip if off2 == 0
> - *  r2 <- 8           \  // r2 may be broken by probe_read, so set again
> - *  r1 <- r7           |-> generated by gen_read_mem()
> - *  call probe_read    /
> - *  jnei r0, 0, err  ./
> - *  ...
> - */
> -static int
> -gen_prologue_slowpath(struct bpf_insn_pos *pos,
> -		      struct probe_trace_arg *args, int nargs)
> -{
> -	int err, i, probeid;
> -
> -	for (i = 0; i < nargs; i++) {
> -		struct probe_trace_arg *arg = &args[i];
> -		const char *reg = arg->value;
> -		struct probe_trace_arg_ref *ref = NULL;
> -		int stack_offset = (i + 1) * -8;
> -
> -		pr_debug("prologue: fetch arg %d, base reg is %s\n",
> -			 i, reg);
> -
> -		/* value of base register is stored into ARG3 */
> -		err = gen_ldx_reg_from_ctx(pos, BPF_REG_CTX, reg,
> -					   BPF_REG_ARG3);
> -		if (err) {
> -			pr_err("prologue: failed to get offset of register %s\n",
> -			       reg);
> -			goto errout;
> -		}
> -
> -		/* Make r7 the stack pointer. */
> -		ins(BPF_MOV64_REG(BPF_REG_7, BPF_REG_FP), pos);
> -		/* r7 += -8 */
> -		ins(BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, stack_offset), pos);
> -		/*
> -		 * Store r3 (base register) onto stack
> -		 * Ensure fp[offset] is set.
> -		 * fp is the only valid base register when storing
> -		 * into stack. We are not allowed to use r7 as base
> -		 * register here.
> -		 */
> -		ins(BPF_STX_MEM(BPF_DW, BPF_REG_FP, BPF_REG_ARG3,
> -				stack_offset), pos);
> -
> -		ref = arg->ref;
> -		probeid = BPF_FUNC_probe_read_kernel;
> -		while (ref) {
> -			pr_debug("prologue: arg %d: offset %ld\n",
> -				 i, ref->offset);
> -
> -			if (ref->user_access)
> -				probeid = BPF_FUNC_probe_read_user;
> -
> -			err = gen_read_mem(pos, BPF_REG_3, BPF_REG_7,
> -					   ref->offset, probeid);
> -			if (err) {
> -				pr_err("prologue: failed to generate probe_read function call\n");
> -				goto errout;
> -			}
> -
> -			ref = ref->next;
> -			/*
> -			 * Load previous result into ARG3. Use
> -			 * BPF_REG_FP instead of r7 because verifier
> -			 * allows FP based addressing only.
> -			 */
> -			if (ref)
> -				ins(BPF_LDX_MEM(BPF_DW, BPF_REG_ARG3,
> -						BPF_REG_FP, stack_offset), pos);
> -		}
> -	}
> -
> -	/* Final pass: read to registers */
> -	for (i = 0; i < nargs; i++) {
> -		int insn_sz = (args[i].ref) ? argtype_to_ldx_size(args[i].type) : BPF_DW;
> -
> -		pr_debug("prologue: load arg %d, insn_sz is %s\n",
> -			 i, insn_sz_to_str(insn_sz));
> -		ins(BPF_LDX_MEM(insn_sz, BPF_PROLOGUE_START_ARG_REG + i,
> -				BPF_REG_FP, -BPF_REG_SIZE * (i + 1)), pos);
> -	}
> -
> -	ins(BPF_JMP_IMM(BPF_JA, BPF_REG_0, 0, JMP_TO_SUCCESS_CODE), pos);
> -
> -	return check_pos(pos);
> -errout:
> -	return err;
> -}
> -
> -static int
> -prologue_relocate(struct bpf_insn_pos *pos, struct bpf_insn *error_code,
> -		  struct bpf_insn *success_code, struct bpf_insn *user_code)
> -{
> -	struct bpf_insn *insn;
> -
> -	if (check_pos(pos))
> -		return -BPF_LOADER_ERRNO__PROLOGUE2BIG;
> -
> -	for (insn = pos->begin; insn < pos->pos; insn++) {
> -		struct bpf_insn *target;
> -		u8 class = BPF_CLASS(insn->code);
> -		u8 opcode;
> -
> -		if (class != BPF_JMP)
> -			continue;
> -		opcode = BPF_OP(insn->code);
> -		if (opcode == BPF_CALL)
> -			continue;
> -
> -		switch (insn->off) {
> -		case JMP_TO_ERROR_CODE:
> -			target = error_code;
> -			break;
> -		case JMP_TO_SUCCESS_CODE:
> -			target = success_code;
> -			break;
> -		case JMP_TO_USER_CODE:
> -			target = user_code;
> -			break;
> -		default:
> -			pr_err("bpf prologue: internal error: relocation failed\n");
> -			return -BPF_LOADER_ERRNO__PROLOGUE;
> -		}
> -
> -		insn->off = target - (insn + 1);
> -	}
> -	return 0;
> -}
> -
> -int bpf__gen_prologue(struct probe_trace_arg *args, int nargs,
> -		      struct bpf_insn *new_prog, size_t *new_cnt,
> -		      size_t cnt_space)
> -{
> -	struct bpf_insn *success_code = NULL;
> -	struct bpf_insn *error_code = NULL;
> -	struct bpf_insn *user_code = NULL;
> -	struct bpf_insn_pos pos;
> -	bool fastpath = true;
> -	int err = 0, i;
> -
> -	if (!new_prog || !new_cnt)
> -		return -EINVAL;
> -
> -	if (cnt_space > BPF_MAXINSNS)
> -		cnt_space = BPF_MAXINSNS;
> -
> -	pos.begin = new_prog;
> -	pos.end = new_prog + cnt_space;
> -	pos.pos = new_prog;
> -
> -	if (!nargs) {
> -		ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0),
> -		    &pos);
> -
> -		if (check_pos(&pos))
> -			goto errout;
> -
> -		*new_cnt = pos_get_cnt(&pos);
> -		return 0;
> -	}
> -
> -	if (nargs > BPF_PROLOGUE_MAX_ARGS) {
> -		pr_warning("bpf: prologue: %d arguments are dropped\n",
> -			   nargs - BPF_PROLOGUE_MAX_ARGS);
> -		nargs = BPF_PROLOGUE_MAX_ARGS;
> -	}
> -
> -	/* First pass: validation */
> -	for (i = 0; i < nargs; i++) {
> -		struct probe_trace_arg_ref *ref = args[i].ref;
> -
> -		if (args[i].value[0] == '@') {
> -			/* TODO: fetch global variable */
> -			pr_err("bpf: prologue: global %s%+ld not support\n",
> -				args[i].value, ref ? ref->offset : 0);
> -			return -ENOTSUP;
> -		}
> -
> -		while (ref) {
> -			/* fastpath is true if all args has ref == NULL */
> -			fastpath = false;
> -
> -			/*
> -			 * Instruction encodes immediate value using
> -			 * s32, ref->offset is long. On systems which
> -			 * can't fill long in s32, refuse to process if
> -			 * ref->offset too large (or small).
> -			 */
> -#ifdef __LP64__
> -#define OFFSET_MAX	((1LL << 31) - 1)
> -#define OFFSET_MIN	((1LL << 31) * -1)
> -			if (ref->offset > OFFSET_MAX ||
> -					ref->offset < OFFSET_MIN) {
> -				pr_err("bpf: prologue: offset out of bound: %ld\n",
> -				       ref->offset);
> -				return -BPF_LOADER_ERRNO__PROLOGUEOOB;
> -			}
> -#endif
> -			ref = ref->next;
> -		}
> -	}
> -	pr_debug("prologue: pass validation\n");
> -
> -	if (fastpath) {
> -		/* If all variables are registers... */
> -		pr_debug("prologue: fast path\n");
> -		err = gen_prologue_fastpath(&pos, args, nargs);
> -		if (err)
> -			goto errout;
> -	} else {
> -		pr_debug("prologue: slow path\n");
> -
> -		/* Initialization: move ctx to a callee saved register. */
> -		ins(BPF_MOV64_REG(BPF_REG_CTX, BPF_REG_ARG1), &pos);
> -
> -		err = gen_prologue_slowpath(&pos, args, nargs);
> -		if (err)
> -			goto errout;
> -		/*
> -		 * start of ERROR_CODE (only slow pass needs error code)
> -		 *   mov r2 <- 1  // r2 is error number
> -		 *   mov r3 <- 0  // r3, r4... should be touched or
> -		 *                // verifier would complain
> -		 *   mov r4 <- 0
> -		 *   ...
> -		 *   goto usercode
> -		 */
> -		error_code = pos.pos;
> -		ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 1),
> -		    &pos);
> -
> -		for (i = 0; i < nargs; i++)
> -			ins(BPF_ALU64_IMM(BPF_MOV,
> -					  BPF_PROLOGUE_START_ARG_REG + i,
> -					  0),
> -			    &pos);
> -		ins(BPF_JMP_IMM(BPF_JA, BPF_REG_0, 0, JMP_TO_USER_CODE),
> -				&pos);
> -	}
> -
> -	/*
> -	 * start of SUCCESS_CODE:
> -	 *   mov r2 <- 0
> -	 *   goto usercode  // skip
> -	 */
> -	success_code = pos.pos;
> -	ins(BPF_ALU64_IMM(BPF_MOV, BPF_PROLOGUE_FETCH_RESULT_REG, 0), &pos);
> -
> -	/*
> -	 * start of USER_CODE:
> -	 *   Restore ctx to r1
> -	 */
> -	user_code = pos.pos;
> -	if (!fastpath) {
> -		/*
> -		 * Only slow path needs restoring of ctx. In fast path,
> -		 * register are loaded directly from r1.
> -		 */
> -		ins(BPF_MOV64_REG(BPF_REG_ARG1, BPF_REG_CTX), &pos);
> -		err = prologue_relocate(&pos, error_code, success_code,
> -					user_code);
> -		if (err)
> -			goto errout;
> -	}
> -
> -	err = check_pos(&pos);
> -	if (err)
> -		goto errout;
> -
> -	*new_cnt = pos_get_cnt(&pos);
> -	return 0;
> -errout:
> -	return err;
> -}

-- 

- Arnaldo

next prev parent reply	other threads:[~2022-01-22 20:32 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-16 22:21 [PATCH bpf-next 0/2] perf: stop using deprecated bpf APIs Christy Lee
2021-12-16 22:21 ` [PATCH bpf-next 1/2] perf: stop using deprecated bpf_prog_load() API Christy Lee
2021-12-16 22:21 ` [PATCH bpf-next 2/2] perf: stop using deprecated bpf__object_next() API Christy Lee
2021-12-21  8:22   ` Jiri Olsa
2021-12-21 21:58     ` Andrii Nakryiko
2021-12-22 13:44       ` Jiri Olsa
2021-12-22 22:17         ` Andrii Nakryiko
2021-12-29 19:01           ` Christy Lee
2022-01-04 14:40             ` Jiri Olsa
2022-01-05 13:49               ` Jiri Olsa
2022-01-06 17:54                 ` Christy Lee
2022-01-06 22:41                   ` Jiri Olsa
2022-01-13 15:14                   ` Jiri Olsa
2022-01-14 21:00                     ` Andrii Nakryiko
2022-01-17  9:25                       ` Jiri Olsa
2022-01-18 23:05                         ` Andrii Nakryiko
2022-01-22 20:29                         ` Arnaldo Carvalho de Melo [this message]
2022-01-06 20:25                 ` Arnaldo Carvalho de Melo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YexpRMs2jL+jH83e@kernel.org \
    --to=acme@kernel.org \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bobo.shaobowang@huawei.com \
    --cc=bpf@vger.kernel.org \
    --cc=christyc.y.lee@gmail.com \
    --cc=christylee@fb.com \
    --cc=hekuang@huawei.com \
    --cc=jolsa@redhat.com \
    --cc=kernel-team@fb.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=wangnan0@huawei.com \
    --cc=yuehaibing@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).