Linux Perf Users
From: Namhyung Kim <namhyung@kernel.org>
To: Tanushree Shah <tshah@linux.ibm.com>
Cc: acme@kernel.org, jolsa@kernel.org, adrian.hunter@intel.com,
	vmolnaro@redhat.com, mpetlan@redhat.com, tmricht@linux.ibm.com,
	maddy@linux.ibm.com, irogers@google.com,
	linux-perf-users@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	atrajeev@linux.ibm.com, hbathini@linux.ibm.com,
	Tejas.Manhas1@ibm.com, Tanushree.Shah@ibm.com,
	Shivani.Nittor@ibm.com
Subject: Re: [PATCH v3] perf dso: Fix kallsyms DSO detection with fallback logic
Date: Tue, 30 Jun 2026 16:51:25 -0700
Message-ID: <akRWfQB9lLwq2HsR@google.com>
In-Reply-To: <20260626161052.1024439-2-tshah@linux.ibm.com>

On Fri, Jun 26, 2026 at 09:40:53PM +0530, Tanushree Shah wrote:
> The current kallsyms detection in dso__is_kallsyms() uses the
> dso_binary_type enum which fixes the issue of kallsyms being cached in
> the build-id cache for out-of-tree modules.
> 
> However, during build-id injection in perf record/inject, dso_binary_type
> has not been explicitly set yet, so dso__binary_type() returns
> DSO_BINARY_TYPE__NOT_FOUND instead of DSO_BINARY_TYPE__KALLSYMS for the
> kernel DSO. The current check then fails to identify it as kallsyms,
> causing build-id symlinks to not be created in ~/.debug/.build-id/ and
> perf archive to fail with "Cannot stat" errors.
> 
> Steps to reproduce the issue:
> 1. rm -rf ~/.debug/.build-id
> 2. perf record sleep 1
> 3. perf archive
> 
> Fix by falling back to matching long_name against the known kallsyms
> strings explicitly when binary_type is not yet set
> (== DSO_BINARY_TYPE__NOT_FOUND). Use strcmp() for exact matching of
> fixed names and strict validation for guest kallsyms with embedded PID
> to prevent path traversal attacks.
> 
> Fixes: ebf0b332732d ("perf dso: fix dso__is_kallsyms() check")
> Signed-off-by: Tanushree Shah <tshah@linux.ibm.com>
> ---
> v2 -> v3: Replace strncmp() prefix matching with strcmp() for fixed
>           kallsyms names and add is_guest_kallsyms_pid_name() to
>           strictly validate guest kallsyms with PID format, preventing
>           path traversal attacks.
> 
> v1 -> v2: Rename DSO__NAME_GUEST_KALLSYMS to DSO__PREFIX_GUEST_KALLSYMS
>           to reflect that it is a prefix, not a full name.
> 
> v1: https://lore.kernel.org/all/20260410071225.708005-2-tshah@linux.ibm.com/
> 
>  tools/perf/util/dso.h | 57 ++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 56 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index ede691e9a249..8763e6f65316 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -9,6 +9,7 @@
>  #include <stdbool.h>
>  #include <stdio.h>
>  #include <linux/bitops.h>
> +#include <string.h>
>  #include "build-id.h"
>  #include "debuginfo.h"
>  #include "mutex.h"
> @@ -20,6 +21,40 @@ struct perf_env;
>  
>  #define DSO__NAME_KALLSYMS	"[kernel.kallsyms]"
>  #define DSO__NAME_KCORE		"[kernel.kcore]"
> +#define DSO__NAME_GUEST_KALLSYMS		"[guest.kernel.kallsyms]"
> +#define DSO__NAME_GUEST_KALLSYMS_PID_PREFIX	"[guest.kernel.kallsyms."
> +
> +/*
> + * Validate names of the form "[guest.kernel.kallsyms.<pid>]", where
> + * <pid> is the PID of the guest VM and varies per guest, so it
> + * cannot be matched with strcmp() against a fixed string.
> + *
> + * Every character after the fixed prefix must be a decimal digit,
> + * with ']' immediately terminating the digit run and nothing
> + * following it. This rules out '/', "..", or any other character
> + * being smuggled into the name.
> + */
> +static inline bool is_guest_kallsyms_pid_name(const char *name)
> +{
> +	const size_t prefix_len = sizeof(DSO__NAME_GUEST_KALLSYMS_PID_PREFIX) - 1;
> +	size_t digits;
> +
> +	if (strncmp(name, DSO__NAME_GUEST_KALLSYMS_PID_PREFIX, prefix_len) != 0)
> +		return false;
> +
> +	digits = strspn(name + prefix_len, "0123456789");
> +	if (digits == 0)
> +		return false;
> +
> +	/* ']' must terminate the digit run, with nothing trailing it */
> +	if (name[prefix_len + digits] != ']')
> +		return false;
> +
> +	if (name[prefix_len + digits + 1] != '\0')
> +		return false;
> +
> +	return true;
> +}
>  
>  /**
>   * enum dso_binary_type - The kind of DSO generally associated with a memory
> @@ -914,8 +949,28 @@ static inline bool dso__is_kcore(const struct dso *dso)
>  static inline bool dso__is_kallsyms(const struct dso *dso)
>  {
>  	enum dso_binary_type bt = dso__binary_type(dso);

I have to check its usage carefully, but is there any chance that
dso__symtab_type(dso) would produce better results here instead?


> +	const char *name;
> +
> +	if (bt == DSO_BINARY_TYPE__KALLSYMS || bt == DSO_BINARY_TYPE__GUEST_KALLSYMS)
> +		return true;
> +
> +	if (bt != DSO_BINARY_TYPE__NOT_FOUND)
> +		return false;
> +
> +	if (!RC_CHK_ACCESS(dso)->kernel)

I think the proper wrapper is dso__kernel().


> +		return false;
> +
> +	name = RC_CHK_ACCESS(dso)->long_name;

And dso__long_name().
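
To make the two accessor suggestions concrete, here is a rough standalone
sketch of the fallback written with dso__kernel() and dso__long_name()
instead of RC_CHK_ACCESS(). The struct, the enum and the three accessors
below are simplified stand-ins so that the snippet compiles on its own;
they are assumptions for illustration and not the real definitions in
tools/perf/util/dso.h. dso__is_kallsyms_selfcheck() is a hypothetical
helper added only to show the expected results.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DSO__NAME_KALLSYMS			"[kernel.kallsyms]"
#define DSO__NAME_GUEST_KALLSYMS		"[guest.kernel.kallsyms]"
#define DSO__NAME_GUEST_KALLSYMS_PID_PREFIX	"[guest.kernel.kallsyms."

/* Simplified stand-ins for illustration only, NOT perf's real definitions. */
enum dso_binary_type {
	DSO_BINARY_TYPE__KALLSYMS,
	DSO_BINARY_TYPE__GUEST_KALLSYMS,
	DSO_BINARY_TYPE__BUILD_ID_CACHE,
	DSO_BINARY_TYPE__NOT_FOUND,
};

struct dso {
	enum dso_binary_type binary_type;
	int kernel;			/* non-zero for kernel DSOs */
	const char *long_name;
};

/* Stand-in accessors; in perf these hide the refcount-checking wrapper. */
static enum dso_binary_type dso__binary_type(const struct dso *dso)
{
	return dso->binary_type;
}

static int dso__kernel(const struct dso *dso)
{
	return dso->kernel;
}

static const char *dso__long_name(const struct dso *dso)
{
	return dso->long_name;
}

/* Same checks as in the patch: prefix, at least one digit, then "]" and NUL. */
static bool is_guest_kallsyms_pid_name(const char *name)
{
	const size_t prefix_len = sizeof(DSO__NAME_GUEST_KALLSYMS_PID_PREFIX) - 1;
	size_t digits;

	if (strncmp(name, DSO__NAME_GUEST_KALLSYMS_PID_PREFIX, prefix_len) != 0)
		return false;

	digits = strspn(name + prefix_len, "0123456789");
	if (digits == 0)
		return false;

	return name[prefix_len + digits] == ']' &&
	       name[prefix_len + digits + 1] == '\0';
}

static bool dso__is_kallsyms(const struct dso *dso)
{
	enum dso_binary_type bt = dso__binary_type(dso);
	const char *name;

	if (bt == DSO_BINARY_TYPE__KALLSYMS || bt == DSO_BINARY_TYPE__GUEST_KALLSYMS)
		return true;

	/* Only fall back to the name when the type has not been set yet. */
	if (bt != DSO_BINARY_TYPE__NOT_FOUND)
		return false;

	if (!dso__kernel(dso))
		return false;

	name = dso__long_name(dso);
	if (!name)
		return false;

	if (!strcmp(name, DSO__NAME_KALLSYMS) || !strcmp(name, DSO__NAME_GUEST_KALLSYMS))
		return true;

	return is_guest_kallsyms_pid_name(name);
}

/* Hypothetical self-check covering the type-set and name-fallback paths. */
void dso__is_kallsyms_selfcheck(void)
{
	struct dso typed    = { DSO_BINARY_TYPE__KALLSYMS, 0, NULL };
	struct dso fallback = { DSO_BINARY_TYPE__NOT_FOUND, 1, DSO__NAME_KALLSYMS };
	struct dso user     = { DSO_BINARY_TYPE__NOT_FOUND, 0, DSO__NAME_KALLSYMS };

	assert(dso__is_kallsyms(&typed));
	assert(dso__is_kallsyms(&fallback));
	assert(!dso__is_kallsyms(&user));
}
```

The control flow is unchanged from the patch; only the field accesses go
through the accessors.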

Thanks,
Namhyung


> +	if (!name)
> +		return false;
> +
> +	if (!strcmp(name, DSO__NAME_KALLSYMS))
> +		return true;
> +
> +	if (!strcmp(name, DSO__NAME_GUEST_KALLSYMS))
> +		return true;
>  
> -	return bt == DSO_BINARY_TYPE__KALLSYMS || bt == DSO_BINARY_TYPE__GUEST_KALLSYMS;
> +	return is_guest_kallsyms_pid_name(name);
>  }
>  
>  bool dso__is_object_file(const struct dso *dso);
> -- 
> 2.47.3
> 
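
For readers following the v2 -> v3 note, a small standalone sketch of why
exact matching is stricter than prefix matching. It assumes the earlier
version compared only the leading bytes of the name, which I have not
verified against v2; matches_by_prefix(), matches_exactly() and
name_match_selfcheck() are hypothetical names used only for this
illustration, and the second test string is a made-up example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define KALLSYMS_NAME "[kernel.kallsyms]"

/* Prefix comparison: anything may follow the fixed name. */
static bool matches_by_prefix(const char *name)
{
	return strncmp(name, KALLSYMS_NAME, sizeof(KALLSYMS_NAME) - 1) == 0;
}

/* Exact comparison: the whole string must equal the fixed name. */
static bool matches_exactly(const char *name)
{
	return strcmp(name, KALLSYMS_NAME) == 0;
}

/* Hypothetical self-check contrasting the two comparisons. */
void name_match_selfcheck(void)
{
	const char *plain = "[kernel.kallsyms]";
	const char *tail  = "[kernel.kallsyms]/../../tmp/x";

	/* Both accept the plain name. */
	assert(matches_by_prefix(plain));
	assert(matches_exactly(plain));

	/* Only the prefix comparison accepts a name with a trailing path. */
	assert(matches_by_prefix(tail));
	assert(!matches_exactly(tail));
}
```

The PID-suffixed guest name cannot be compared against one fixed string,
which is why the patch validates it character by character instead.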
