public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: "Emil Tsalapatis" <emil@etsalapatis.com>
To: "Andrey Grodzovsky" <andrey.grodzovsky@crowdstrike.com>,
	<bpf@vger.kernel.org>
Cc: <ast@kernel.org>, <daniel@iogearbox.net>, <andrii@kernel.org>,
	<jolsa@kernel.org>, <linux-open-source@crowdstrike.com>
Subject: Re: [RFC PATCH bpf-next 1/2] libbpf: Handle duplicate kprobe symbols in attach_kprobe_opts
Date: Wed, 18 Feb 2026 19:28:45 -0500	[thread overview]
Message-ID: <DGIIHMD5LE8T.ADJM2OXAWCGZ@etsalapatis.com> (raw)
In-Reply-To: <20260218225617.354858-2-andrey.grodzovsky@crowdstrike.com>

On Wed Feb 18, 2026 at 5:56 PM EST, Andrey Grodzovsky wrote:
> Add fallback to handle EADDRNOTAVAIL by retrying with absolute
> kernel addresses from kallsyms when symbol names are ambiguous.
>
> When kprobe attachment fails with EADDRNOTAVAIL (or EINVAL in
> legacy mode) and a symbol name was used, the kernel encountered
> duplicate symbols in kallsyms. This patch:
>
> - Adds module_name parameter to kallsyms_cb_t callback and
>   parser to distinguish vmlinux from module symbols
> - Implements find_kaddr_cb() to look up symbol addresses in
>   kallsyms, rejecting ambiguous symbol names
> - Modifies add_kprobe_event_legacy() to support absolute
>   address format
> - Implements retry logic in attach_kprobe_opts() that falls
>   back to address-based attachment on failure
> - Updates avail_kallsyms_cb() callback signature
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com>
> ---
>  tools/lib/bpf/libbpf.c | 150 ++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 133 insertions(+), 17 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 0c8bf0b5cce4..684781d25859 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -8449,11 +8449,12 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
>  }
>  
>  typedef int (*kallsyms_cb_t)(unsigned long long sym_addr, char sym_type,
> -			     const char *sym_name, void *ctx);
> +			     const char *sym_name, const char *module_name, void *ctx);
>  
>  static int libbpf_kallsyms_parse(kallsyms_cb_t cb, void *ctx)
>  {
>  	char sym_type, sym_name[500];
> +	char module_name[128];

Why 128? __MODULE_NAME_LEN is defined as (64 - sizeof(unsigned long)) in
moduleparam.h.

>  	unsigned long long sym_addr;
>  	int ret, err = 0;
>  	FILE *f;
> @@ -8465,18 +8466,39 @@ static int libbpf_kallsyms_parse(kallsyms_cb_t cb, void *ctx)
>  		return err;
>  	}
>  
> -	while (true) {
> -		ret = fscanf(f, "%llx %c %499s%*[^\n]\n",
> -			     &sym_addr, &sym_type, sym_name);
> -		if (ret == EOF && feof(f))
> +	while (!feof(f)) {
> +		/* Position indicator - will be updated by %n if format fully matches */
> +		int n = 0;
> +		/*
> +		 * The module name and symbol name are both without whitespace,
> +		 * but we cannot use %s to capture, as it consumes the closing ']'
> +		 * of the module format. Hence the %127[^] \t\n\r\v\f] which provides
> +		 * the equivalent effect.
> +		 */
> +		ret = fscanf(f, "%llx %c %499s [%127[^] \t\n\r\v\f]] %n",
> +			     &sym_addr, &sym_type, sym_name, module_name, &n);
> +


The following:

> +		if (ret == 4 && n > 0) {
> +			/*
> +			 * Module symbol: all 4 fields matched AND we reached %n.
> +			 * n > 0 means we successfully parsed up to the closing ']'.
> +			 */
> +			err = cb(sym_addr, sym_type, sym_name, module_name, ctx);
> +		} else if (ret == 3) {
> +			/*
> +			 * vmlinux symbol: 3 fields matched, next is matching failure against
> +			 * '[', but we don't care as we got what we needed. Also note that the
> +			 * trailing newline was consumed by the space after the symbol name.
> +			 */
> +			err = cb(sym_addr, sym_type, sym_name, NULL, ctx);
> +		} else if (ret == EOF && feof(f)) {
>  			break;
> -		if (ret != 3) {
> -			pr_warn("failed to read kallsyms entry: %d\n", ret);
> +		} else {
> +			pr_warn("failed to read kallsyms entry: ret=%d n=%d\n", ret, n);
>  			err = -EINVAL;
>  			break;
>  		}
>  
> -		err = cb(sym_addr, sym_type, sym_name, ctx);
>  		if (err)
>  			break;
>  	}


is pretty difficult to follow. Can avoid the long if-else chain
immediately after the fscanf? Maybe something like the following:

char *name;

while (
	ret = fscanf(...);
	if (ret != 3 && ret != 4)
		break;

	if (ret == 4 && n > 0)
		name = module_name;
	else
		name = NULL;

	err = cb(..., name, ctx);
	if (err)
		...
}

if (!feof(f)) {
	pr_warn(...)
	ret = EINVAL;
}

It's also closer to the original code.

> @@ -8486,7 +8508,7 @@ static int libbpf_kallsyms_parse(kallsyms_cb_t cb, void *ctx)
>  }
>  
>  static int kallsyms_cb(unsigned long long sym_addr, char sym_type,
> -		       const char *sym_name, void *ctx)
> +		       const char *sym_name, const char *module_name, void *ctx)
>  {

Is the new argument really needed here? It seems like both for kallsyms_cb
and avail_kallsyms_cb this change is actually there for find_kaddr_cb.
But find_kaddr_cb doesn't even use the module name, it checks if it's 0.
If not then it skips the function.

We can remove the extra argument from all the callback signatures, and
instead just change libbpf_kallsyms parse with an extra parameter:

static int libbpf_kallsyms_parse(kallsyms_cb_t cb, void *ctx, bool
skip_if_module)

or something like that, and skip calling the callback when the flag is
set if module_name != NULL.

>  	struct bpf_object *obj = ctx;
>  	const struct btf_type *t;
> @@ -11518,10 +11540,20 @@ static void gen_probe_legacy_event_name(char *buf, size_t buf_sz,
>  static int add_kprobe_event_legacy(const char *probe_name, bool retprobe,
>  				   const char *kfunc_name, size_t offset)
>  {
> -	return append_to_file(tracefs_kprobe_events(), "%c:%s/%s %s+0x%zx",
> -			      retprobe ? 'r' : 'p',
> -			      retprobe ? "kretprobes" : "kprobes",
> -			      probe_name, kfunc_name, offset);
> +	/* When kfunc_name is NULL, use absolute address format (0xADDR),
> +	 * otherwise use symbol+offset format (SYMBOL+0xOFF)
> +	 */
> +	if (kfunc_name) {
> +		return append_to_file(tracefs_kprobe_events(), "%c:%s/%s %s+0x%zx",
> +				      retprobe ? 'r' : 'p',
> +				      retprobe ? "kretprobes" : "kprobes",
> +				      probe_name, kfunc_name, offset);
> +	} else {
> +		return append_to_file(tracefs_kprobe_events(), "%c:%s/%s 0x%zx",
> +				      retprobe ? 'r' : 'p',
> +				      retprobe ? "kretprobes" : "kprobes",
> +				      probe_name, offset);
> +	}
>  }
>  
>  static int remove_kprobe_event_legacy(const char *probe_name, bool retprobe)
> @@ -11642,6 +11674,47 @@ int probe_kern_syscall_wrapper(int token_fd)
>  	}
>  }
>  
> +/* Context structure for finding vmlinux kernel symbol address */
> +struct find_kaddr_ctx {
> +	const char *func_name;
> +	unsigned long long kaddr;
> +};
> +
> +/* Callback to find vmlinux kernel text symbol address, skipping module symbols */
> +static int find_kaddr_cb(unsigned long long sym_addr, char sym_type,
> +			 const char *sym_name, const char *module_name, void *ctx)
> +{
> +	struct find_kaddr_ctx *data = ctx;
> +
> +	/* Skip module symbols - we only want vmlinux symbols */
> +	if (module_name != NULL)
> +		return 0;
> +
> +	/* Only match text section symbols ('T' or 't') */
> +	if (sym_type != 'T' && sym_type != 't')

nit: toupper(symtype) != 'T' ?

> +		return 0;
> +
> +	/* Check if this is the symbol we're looking for */
> +	if (strcmp(sym_name, data->func_name) == 0) {
> +
> +		/* If we already have an address, we've encountered a
> +		 * duplicate symbol name. Reject such symbols to avoid
> +		 * ambiguous resolution.
> +		 */
> +		if (data->kaddr && data->kaddr != sym_addr) {
> +			pr_warn("kernel symbol '%s': resolution is ambiguous: 0x%llx or 0x%llx\n",
> +			sym_name, data->kaddr, sym_addr);
> +			return -EINVAL;
> +		}
> +
> +		data->kaddr = sym_addr;
> +		pr_debug("found kernel symbol %s at address 0x%llx\n",
> +			 sym_name, data->kaddr);
> +	}
> +
> +	return 0;
> +}
> +
>  struct bpf_link *
>  bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
>  				const char *func_name,
> @@ -11654,6 +11727,8 @@ bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
>  	size_t offset;
>  	bool retprobe, legacy;
>  	int pfd, err;
> +	const char *optional_func_name = func_name;
> +	struct find_kaddr_ctx kaddr_ctx = { .kaddr = 0 };
>  
>  	if (!OPTS_VALID(opts, bpf_kprobe_opts))
>  		return libbpf_err_ptr(-EINVAL);
> @@ -11684,21 +11759,22 @@ bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
>  		return libbpf_err_ptr(-EINVAL);
>  	}
>  
> +retry_create:
>  	if (!legacy) {
>  		pfd = perf_event_open_probe(false /* uprobe */, retprobe,
> -					    func_name, offset,
> +					    optional_func_name, offset,
>  					    -1 /* pid */, 0 /* ref_ctr_off */);
>  	} else {
>  		char probe_name[MAX_EVENT_NAME_LEN];
>  
>  		gen_probe_legacy_event_name(probe_name, sizeof(probe_name),
> -					    func_name, offset);
> +					    optional_func_name, offset);
>  
>  		legacy_probe = strdup(probe_name);
>  		if (!legacy_probe)
>  			return libbpf_err_ptr(-ENOMEM);
>  
> -		pfd = perf_event_kprobe_open_legacy(legacy_probe, retprobe, func_name,
> +		pfd = perf_event_kprobe_open_legacy(legacy_probe, retprobe, optional_func_name,
>  						    offset, -1 /* pid */);
>  	}
>  	if (pfd < 0) {
> @@ -11707,6 +11783,46 @@ bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
>  			prog->name, retprobe ? "kretprobe" : "kprobe",
>  			func_name, offset,
>  			errstr(err));
> +
> +		/*
> +		 * If attachment fails with EADDRNOTAVAIL (or EINVAL in
> +		 * legacy mode) and we used non-NULL func_name, try to
> +		 * find function address in /proc/kallsyms and retry
> +		 * using KADDR instead of ambiguous symbol. Legacy
> +		 * tracefs API converts EADDRNOTAVAIL to EINVAL, so
> +		 * we need to check for both.
> +		 */
> +		if ((err == -EADDRNOTAVAIL || (legacy && err == -EINVAL)) && optional_func_name) {
> +			pr_debug("kallsyms lookup for %s\n",
> +				 func_name);
> +
> +			kaddr_ctx.func_name = func_name;
> +			kaddr_ctx.kaddr = 0;
> +
> +			/*
> +			 * Drop the function symbol and override the
> +			 * offset to be the desired function KADDR.
> +			 * This avoids kallsyms validation for duplicate
> +			 * symbols in the kernel. See attr.config2 in
> +			 * perf_event_open_probe and kernel code in
> +			 * perf_kprobe_init/create_local_trace_kprobe.
> +			 */
> +			optional_func_name = NULL;

AFAICT this will cause NULL to be passed to snprintf in
gen_probe_legacy_event_name.

> +
> +			err = libbpf_kallsyms_parse(find_kaddr_cb, &kaddr_ctx);
> +			if (err) {
> +				pr_warn("failed to get addr for %s\n",
> +					func_name);
> +				goto err_out;
> +			}
> +
> +			pr_debug("retrying %s using kaddr 0x%llx\n",
> +				 func_name, kaddr_ctx.kaddr);
> +
> +			offset += kaddr_ctx.kaddr;
> +			goto retry_create;
> +		}
> +
>  		goto err_out;
>  	}
>  	link = bpf_program__attach_perf_event_opts(prog, pfd, &pe_opts);
> @@ -11822,7 +11938,7 @@ static int avail_func_cmp(const void *a, const void *b)
>  }
>  
>  static int avail_kallsyms_cb(unsigned long long sym_addr, char sym_type,
> -			     const char *sym_name, void *ctx)
> +			     const char *sym_name, const char *module_name, void *ctx)
>  {
>  	struct avail_kallsyms_data *data = ctx;
>  	struct kprobe_multi_resolve *res = data->res;


  parent reply	other threads:[~2026-02-19  0:28 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-18 22:56 [RFC PATCH bpf-next 0/2] libbpf: Handle duplicate kprobe symbols Andrey Grodzovsky
2026-02-18 22:56 ` [RFC PATCH bpf-next 1/2] libbpf: Handle duplicate kprobe symbols in attach_kprobe_opts Andrey Grodzovsky
2026-02-18 23:31   ` bot+bpf-ci
2026-02-19  0:28   ` Emil Tsalapatis [this message]
2026-02-19 21:57     ` [External] " Andrey Grodzovsky
2026-02-18 22:56 ` [RFC PATCH bpf-next 2/2] selftests/bpf: Add tests for duplicate kprobe symbol handling Andrey Grodzovsky
2026-02-23  9:19   ` Jiri Olsa
2026-02-23 18:56     ` [External] " Andrey Grodzovsky
2026-02-24 10:54       ` Jiri Olsa
2026-02-24 13:06         ` Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DGIIHMD5LE8T.ADJM2OXAWCGZ@etsalapatis.com \
    --to=emil@etsalapatis.com \
    --cc=andrey.grodzovsky@crowdstrike.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jolsa@kernel.org \
    --cc=linux-open-source@crowdstrike.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox