linux-kbuild.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] Fix 'faddr2line' for LLVM arm64 builds
@ 2023-07-28 11:34 Will Deacon
  2023-07-28 11:34 ` [PATCH v3 1/4] scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1 Will Deacon
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Will Deacon @ 2023-07-28 11:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Will Deacon, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Josh Poimboeuf, John Stultz,
	linux-kbuild

Hi all,

Here's version three of my faddr2line fixes previously posted here:

v1: https://lore.kernel.org/r/20230724174517.15736-1-will@kernel.org
v2: https://lore.kernel.org/r/20230725211157.17031-1-will@kernel.org

Changes since v2 include:
  * Brought back the (unchanged) patch introducing LLVM=1 so that all
    the patches are in one place and don't end up conflicting with each
    other.
  * Added a new patch to drop the strict "FUNC" symbol type match in
    the outer loop

Cheers,

Will

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: John Stultz <jstultz@google.com>
Cc: linux-kbuild@vger.kernel.org

--->8

Will Deacon (4):
  scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1
  scripts/mksysmap: Factor out sed ignored symbols expression into
    script
  scripts/faddr2line: Constrain readelf output to symbols from
    System.map
  scripts/faddr2line: Don't filter out non-function symbols from readelf

 scripts/faddr2line              | 15 +++++--
 scripts/mksysmap                | 77 +--------------------------------
 scripts/sysmap-ignored-syms.sed | 74 +++++++++++++++++++++++++++++++
 3 files changed, 86 insertions(+), 80 deletions(-)
 create mode 100644 scripts/sysmap-ignored-syms.sed

-- 
2.41.0.487.g6d72f3e995-goog


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/4] scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1
  2023-07-28 11:34 [PATCH v3 0/4] Fix 'faddr2line' for LLVM arm64 builds Will Deacon
@ 2023-07-28 11:34 ` Will Deacon
  2023-07-29 20:47   ` Masahiro Yamada
  2023-07-28 11:34 ` [PATCH v3 2/4] scripts/mksysmap: Factor out sed ignored symbols expression into script Will Deacon
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2023-07-28 11:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Will Deacon, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Josh Poimboeuf, John Stultz,
	linux-kbuild

GNU utilities cannot necessarily parse objects built by LLVM, which can
result in confusing errors when using 'faddr2line':

$ CROSS_COMPILE=aarch64-linux-gnu- ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
aarch64-linux-gnu-addr2line: vmlinux: unknown type [0x13] section `.relr.dyn'
aarch64-linux-gnu-addr2line: DWARF error: invalid or unhandled FORM value: 0x25
do_one_initcall+0xf4/0x260:
aarch64-linux-gnu-addr2line: vmlinux: unknown type [0x13] section `.relr.dyn'
aarch64-linux-gnu-addr2line: DWARF error: invalid or unhandled FORM value: 0x25
$x.73 at main.c:?

Although this can be worked around by setting CROSS_COMPILE to "llvm=-",
it's cleaner to follow the same syntax as the top-level Makefile and
accept LLVM=1 as an indication to use the llvm- tools.

Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: John Stultz <jstultz@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 scripts/faddr2line | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/scripts/faddr2line b/scripts/faddr2line
index 0e73aca4f908..62a3fa6f6f59 100755
--- a/scripts/faddr2line
+++ b/scripts/faddr2line
@@ -58,8 +58,14 @@ die() {
 	exit 1
 }
 
-READELF="${CROSS_COMPILE:-}readelf"
-ADDR2LINE="${CROSS_COMPILE:-}addr2line"
+if [ "${LLVM:-}" == "1" ]; then
+	UTIL_PREFIX=llvm-
+else
+	UTIL_PREFIX=${CROSS_COMPILE:-}
+fi
+
+READELF="${UTIL_PREFIX}readelf"
+ADDR2LINE="${UTIL_PREFIX}addr2line"
 AWK="awk"
 GREP="grep"
 
-- 
2.41.0.487.g6d72f3e995-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/4] scripts/mksysmap: Factor out sed ignored symbols expression into script
  2023-07-28 11:34 [PATCH v3 0/4] Fix 'faddr2line' for LLVM arm64 builds Will Deacon
  2023-07-28 11:34 ` [PATCH v3 1/4] scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1 Will Deacon
@ 2023-07-28 11:34 ` Will Deacon
  2023-07-29 18:38   ` Nicolas Schier
  2023-07-28 11:34 ` [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map Will Deacon
  2023-07-28 11:34 ` [PATCH v3 4/4] scripts/faddr2line: Don't filter out non-function symbols from readelf Will Deacon
  3 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2023-07-28 11:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Will Deacon, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Josh Poimboeuf, John Stultz,
	linux-kbuild

To prepare for 'faddr2line' reusing the same ignored symbols list as
'mksysmap', factor out the relevant sed expression into its own script,
removing the double-escapes for '$' symbols as they are no longer
required.

Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: John Stultz <jstultz@google.com>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
---
 scripts/mksysmap                | 77 +--------------------------------
 scripts/sysmap-ignored-syms.sed | 74 +++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+), 76 deletions(-)
 create mode 100644 scripts/sysmap-ignored-syms.sed

diff --git a/scripts/mksysmap b/scripts/mksysmap
index 9ba1c9da0a40..a98b34363258 100755
--- a/scripts/mksysmap
+++ b/scripts/mksysmap
@@ -16,7 +16,7 @@
 # 'W' or 'w'.
 #
 
-${NM} -n ${1} | sed >${2} -e "
+${NM} -n ${1} | sed >${2} -f $(dirname $0)/sysmap-ignored-syms.sed -e "
 # ---------------------------------------------------------------------------
 # Ignored symbol types
 #
@@ -27,81 +27,6 @@ ${NM} -n ${1} | sed >${2} -e "
 # w: local weak symbols
 / [aNUw] /d
 
-# ---------------------------------------------------------------------------
-# Ignored prefixes
-#  (do not forget a space before each pattern)
-
-# local symbols for ARM, MIPS, etc.
-/ \\$/d
-
-# local labels, .LBB, .Ltmpxxx, .L__unnamed_xx, .LASANPC, etc.
-/ \.L/d
-
-# arm64 EFI stub namespace
-/ __efistub_/d
-
-# arm64 local symbols in PIE namespace
-/ __pi_\\$/d
-/ __pi_\.L/d
-
-# arm64 local symbols in non-VHE KVM namespace
-/ __kvm_nvhe_\\$/d
-/ __kvm_nvhe_\.L/d
-
-# arm64 lld
-/ __AArch64ADRPThunk_/d
-
-# arm lld
-/ __ARMV5PILongThunk_/d
-/ __ARMV7PILongThunk_/d
-/ __ThumbV7PILongThunk_/d
-
-# mips lld
-/ __LA25Thunk_/d
-/ __microLA25Thunk_/d
-
-# CFI type identifiers
-/ __kcfi_typeid_/d
-/ __kvm_nvhe___kcfi_typeid_/d
-/ __pi___kcfi_typeid_/d
-
-# CRC from modversions
-/ __crc_/d
-
-# EXPORT_SYMBOL (symbol name)
-/ __kstrtab_/d
-
-# EXPORT_SYMBOL (namespace)
-/ __kstrtabns_/d
-
-# ---------------------------------------------------------------------------
-# Ignored suffixes
-#  (do not forget '$' after each pattern)
-
-# arm
-/_from_arm$/d
-/_from_thumb$/d
-/_veneer$/d
-
-# ---------------------------------------------------------------------------
-# Ignored symbols (exact match)
-#  (do not forget a space before and '$' after each pattern)
-
-# for LoongArch?
-/ L0$/d
-
-# ppc
-/ _SDA_BASE_$/d
-/ _SDA2_BASE_$/d
-
-# ---------------------------------------------------------------------------
-# Ignored patterns
-#  (symbols that contain the pattern are ignored)
-
-# ppc stub
-/\.long_branch\./d
-/\.plt_branch\./d
-
 # ---------------------------------------------------------------------------
 # Ignored kallsyms symbols
 #
diff --git a/scripts/sysmap-ignored-syms.sed b/scripts/sysmap-ignored-syms.sed
new file mode 100644
index 000000000000..14b9eb2c9ed9
--- /dev/null
+++ b/scripts/sysmap-ignored-syms.sed
@@ -0,0 +1,74 @@
+# ---------------------------------------------------------------------------
+# Ignored prefixes
+#  (do not forget a space before each pattern)
+
+# local symbols for ARM, MIPS, etc.
+/ \$/d
+
+# local labels, .LBB, .Ltmpxxx, .L__unnamed_xx, .LASANPC, etc.
+/ \.L/d
+
+# arm64 EFI stub namespace
+/ __efistub_/d
+
+# arm64 local symbols in PIE namespace
+/ __pi_\$/d
+/ __pi_\.L/d
+
+# arm64 local symbols in non-VHE KVM namespace
+/ __kvm_nvhe_\$/d
+/ __kvm_nvhe_\.L/d
+
+# arm64 lld
+/ __AArch64ADRPThunk_/d
+
+# arm lld
+/ __ARMV5PILongThunk_/d
+/ __ARMV7PILongThunk_/d
+/ __ThumbV7PILongThunk_/d
+
+# mips lld
+/ __LA25Thunk_/d
+/ __microLA25Thunk_/d
+
+# CFI type identifiers
+/ __kcfi_typeid_/d
+/ __kvm_nvhe___kcfi_typeid_/d
+/ __pi___kcfi_typeid_/d
+
+# CRC from modversions
+/ __crc_/d
+
+# EXPORT_SYMBOL (symbol name)
+/ __kstrtab_/d
+
+# EXPORT_SYMBOL (namespace)
+/ __kstrtabns_/d
+
+# ---------------------------------------------------------------------------
+# Ignored suffixes
+#  (do not forget '$' after each pattern)
+
+# arm
+/_from_arm$/d
+/_from_thumb$/d
+/_veneer$/d
+
+# ---------------------------------------------------------------------------
+# Ignored symbols (exact match)
+#  (do not forget a space before and '$' after each pattern)
+
+# for LoongArch?
+/ L0$/d
+
+# ppc
+/ _SDA_BASE_$/d
+/ _SDA2_BASE_$/d
+
+# ---------------------------------------------------------------------------
+# Ignored patterns
+#  (symbols that contain the pattern are ignored)
+
+# ppc stub
+/\.long_branch\./d
+/\.plt_branch\./d
-- 
2.41.0.487.g6d72f3e995-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map
  2023-07-28 11:34 [PATCH v3 0/4] Fix 'faddr2line' for LLVM arm64 builds Will Deacon
  2023-07-28 11:34 ` [PATCH v3 1/4] scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1 Will Deacon
  2023-07-28 11:34 ` [PATCH v3 2/4] scripts/mksysmap: Factor out sed ignored symbols expression into script Will Deacon
@ 2023-07-28 11:34 ` Will Deacon
  2023-08-01 16:42   ` Nick Desaulniers
  2023-08-02 19:54   ` Masahiro Yamada
  2023-07-28 11:34 ` [PATCH v3 4/4] scripts/faddr2line: Don't filter out non-function symbols from readelf Will Deacon
  3 siblings, 2 replies; 13+ messages in thread
From: Will Deacon @ 2023-07-28 11:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Will Deacon, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Josh Poimboeuf, John Stultz,
	linux-kbuild

Some symbols emitted in the readelf output but filtered from System.map
can confuse the 'faddr2line' symbol size calculation, resulting in the
erroneous rejection of valid offsets. This is especially prevalent when
building an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions
are prefixed with a 32-bit data value in a '$d.n' section. For example:

447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
   104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
   106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
   111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
   112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
    36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process

Adding a warning to do_one_initcall() results in:

  | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260

Which 'faddr2line' refuses to accept:

$ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
no match for do_one_initcall+0xf4/0x260

Filter out entries from readelf using the 'sysmap-ignored-syms.sed'
script used to construct System.map, so that the size of a symbol is
calculated as a delta to the next symbol present in ksymtab.

Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: John Stultz <jstultz@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 scripts/faddr2line | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/faddr2line b/scripts/faddr2line
index 62a3fa6f6f59..da734af90036 100755
--- a/scripts/faddr2line
+++ b/scripts/faddr2line
@@ -64,6 +64,7 @@ else
 	UTIL_PREFIX=${CROSS_COMPILE:-}
 fi
 
+IGNORED_SYMS=$(dirname $0)/sysmap-ignored-syms.sed
 READELF="${UTIL_PREFIX}readelf"
 ADDR2LINE="${UTIL_PREFIX}addr2line"
 AWK="awk"
@@ -185,7 +186,7 @@ __faddr2line() {
 				found=2
 				break
 			fi
-		done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
+		done < <(${READELF} --symbols --wide $objfile | sed -f ${IGNORED_SYMS} -e 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
 
 		if [[ $found = 0 ]]; then
 			warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size"
-- 
2.41.0.487.g6d72f3e995-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 4/4] scripts/faddr2line: Don't filter out non-function symbols from readelf
  2023-07-28 11:34 [PATCH v3 0/4] Fix 'faddr2line' for LLVM arm64 builds Will Deacon
                   ` (2 preceding siblings ...)
  2023-07-28 11:34 ` [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map Will Deacon
@ 2023-07-28 11:34 ` Will Deacon
  3 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2023-07-28 11:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Will Deacon, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Nicolas Schier, Josh Poimboeuf, John Stultz,
	linux-kbuild

As Josh points out in 20230724234734.zy67gm674vl3p3wv@treble:

> Problem is, I think the kernel's symbol printing code prints the
> nearest kallsyms symbol, and there are some valid non-FUNC code
> symbols.  For example, syscall_return_via_sysret.

so we shouldn't be considering only 'FUNC'-type symbols in the output
from readelf.

Drop the function symbol type filtering from the faddr2line outer loop.

Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/r/20230724234734.zy67gm674vl3p3wv@treble
Signed-off-by: Will Deacon <will@kernel.org>
---
 scripts/faddr2line | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/faddr2line b/scripts/faddr2line
index da734af90036..47a010615903 100755
--- a/scripts/faddr2line
+++ b/scripts/faddr2line
@@ -267,7 +267,7 @@ __faddr2line() {
 
 		DONE=1
 
-	done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v fn=$sym_name '$4 == "FUNC" && $8 == fn')
+	done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v fn=$sym_name '$8 == fn')
 }
 
 [[ $# -lt 2 ]] && usage
-- 
2.41.0.487.g6d72f3e995-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 2/4] scripts/mksysmap: Factor out sed ignored symbols expression into script
  2023-07-28 11:34 ` [PATCH v3 2/4] scripts/mksysmap: Factor out sed ignored symbols expression into script Will Deacon
@ 2023-07-29 18:38   ` Nicolas Schier
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolas Schier @ 2023-07-29 18:38 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, kernel-team, Masahiro Yamada, Nathan Chancellor,
	Nick Desaulniers, Josh Poimboeuf, John Stultz, linux-kbuild

[-- Attachment #1: Type: text/plain, Size: 5620 bytes --]

On Fri 28 Jul 2023 12:34:13 GMT, Will Deacon wrote:
> To prepare for 'faddr2line' reusing the same ignored symbols list as
> 'mksysmap', factor out the relevant sed expression into its own script,
> removing the double-escapes for '$' symbols as they are no longer
> required.
> 
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Nathan Chancellor <nathan@kernel.org>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Nicolas Schier <nicolas@fjasle.eu>
> Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> Cc: John Stultz <jstultz@google.com>
> Cc: linux-kbuild@vger.kernel.org
> Signed-off-by: Will Deacon <will@kernel.org>
> ---

Thanks!

Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>

>  scripts/mksysmap                | 77 +--------------------------------
>  scripts/sysmap-ignored-syms.sed | 74 +++++++++++++++++++++++++++++++
>  2 files changed, 75 insertions(+), 76 deletions(-)
>  create mode 100644 scripts/sysmap-ignored-syms.sed
> 
> diff --git a/scripts/mksysmap b/scripts/mksysmap
> index 9ba1c9da0a40..a98b34363258 100755
> --- a/scripts/mksysmap
> +++ b/scripts/mksysmap
> @@ -16,7 +16,7 @@
>  # 'W' or 'w'.
>  #
>  
> -${NM} -n ${1} | sed >${2} -e "
> +${NM} -n ${1} | sed >${2} -f $(dirname $0)/sysmap-ignored-syms.sed -e "
>  # ---------------------------------------------------------------------------
>  # Ignored symbol types
>  #
> @@ -27,81 +27,6 @@ ${NM} -n ${1} | sed >${2} -e "
>  # w: local weak symbols
>  / [aNUw] /d
>  
> -# ---------------------------------------------------------------------------
> -# Ignored prefixes
> -#  (do not forget a space before each pattern)
> -
> -# local symbols for ARM, MIPS, etc.
> -/ \\$/d
> -
> -# local labels, .LBB, .Ltmpxxx, .L__unnamed_xx, .LASANPC, etc.
> -/ \.L/d
> -
> -# arm64 EFI stub namespace
> -/ __efistub_/d
> -
> -# arm64 local symbols in PIE namespace
> -/ __pi_\\$/d
> -/ __pi_\.L/d
> -
> -# arm64 local symbols in non-VHE KVM namespace
> -/ __kvm_nvhe_\\$/d
> -/ __kvm_nvhe_\.L/d
> -
> -# arm64 lld
> -/ __AArch64ADRPThunk_/d
> -
> -# arm lld
> -/ __ARMV5PILongThunk_/d
> -/ __ARMV7PILongThunk_/d
> -/ __ThumbV7PILongThunk_/d
> -
> -# mips lld
> -/ __LA25Thunk_/d
> -/ __microLA25Thunk_/d
> -
> -# CFI type identifiers
> -/ __kcfi_typeid_/d
> -/ __kvm_nvhe___kcfi_typeid_/d
> -/ __pi___kcfi_typeid_/d
> -
> -# CRC from modversions
> -/ __crc_/d
> -
> -# EXPORT_SYMBOL (symbol name)
> -/ __kstrtab_/d
> -
> -# EXPORT_SYMBOL (namespace)
> -/ __kstrtabns_/d
> -
> -# ---------------------------------------------------------------------------
> -# Ignored suffixes
> -#  (do not forget '$' after each pattern)
> -
> -# arm
> -/_from_arm$/d
> -/_from_thumb$/d
> -/_veneer$/d
> -
> -# ---------------------------------------------------------------------------
> -# Ignored symbols (exact match)
> -#  (do not forget a space before and '$' after each pattern)
> -
> -# for LoongArch?
> -/ L0$/d
> -
> -# ppc
> -/ _SDA_BASE_$/d
> -/ _SDA2_BASE_$/d
> -
> -# ---------------------------------------------------------------------------
> -# Ignored patterns
> -#  (symbols that contain the pattern are ignored)
> -
> -# ppc stub
> -/\.long_branch\./d
> -/\.plt_branch\./d
> -
>  # ---------------------------------------------------------------------------
>  # Ignored kallsyms symbols
>  #
> diff --git a/scripts/sysmap-ignored-syms.sed b/scripts/sysmap-ignored-syms.sed
> new file mode 100644
> index 000000000000..14b9eb2c9ed9
> --- /dev/null
> +++ b/scripts/sysmap-ignored-syms.sed
> @@ -0,0 +1,74 @@
> +# ---------------------------------------------------------------------------
> +# Ignored prefixes
> +#  (do not forget a space before each pattern)
> +
> +# local symbols for ARM, MIPS, etc.
> +/ \$/d
> +
> +# local labels, .LBB, .Ltmpxxx, .L__unnamed_xx, .LASANPC, etc.
> +/ \.L/d
> +
> +# arm64 EFI stub namespace
> +/ __efistub_/d
> +
> +# arm64 local symbols in PIE namespace
> +/ __pi_\$/d
> +/ __pi_\.L/d
> +
> +# arm64 local symbols in non-VHE KVM namespace
> +/ __kvm_nvhe_\$/d
> +/ __kvm_nvhe_\.L/d
> +
> +# arm64 lld
> +/ __AArch64ADRPThunk_/d
> +
> +# arm lld
> +/ __ARMV5PILongThunk_/d
> +/ __ARMV7PILongThunk_/d
> +/ __ThumbV7PILongThunk_/d
> +
> +# mips lld
> +/ __LA25Thunk_/d
> +/ __microLA25Thunk_/d
> +
> +# CFI type identifiers
> +/ __kcfi_typeid_/d
> +/ __kvm_nvhe___kcfi_typeid_/d
> +/ __pi___kcfi_typeid_/d
> +
> +# CRC from modversions
> +/ __crc_/d
> +
> +# EXPORT_SYMBOL (symbol name)
> +/ __kstrtab_/d
> +
> +# EXPORT_SYMBOL (namespace)
> +/ __kstrtabns_/d
> +
> +# ---------------------------------------------------------------------------
> +# Ignored suffixes
> +#  (do not forget '$' after each pattern)
> +
> +# arm
> +/_from_arm$/d
> +/_from_thumb$/d
> +/_veneer$/d
> +
> +# ---------------------------------------------------------------------------
> +# Ignored symbols (exact match)
> +#  (do not forget a space before and '$' after each pattern)
> +
> +# for LoongArch?
> +/ L0$/d
> +
> +# ppc
> +/ _SDA_BASE_$/d
> +/ _SDA2_BASE_$/d
> +
> +# ---------------------------------------------------------------------------
> +# Ignored patterns
> +#  (symbols that contain the pattern are ignored)
> +
> +# ppc stub
> +/\.long_branch\./d
> +/\.plt_branch\./d
> -- 
> 2.41.0.487.g6d72f3e995-goog

-- 
Nicolas Schier
 
epost|xmpp: nicolas@fjasle.eu          irc://oftc.net/nsc
↳ gpg: 18ed 52db e34f 860e e9fb  c82b 7d97 0932 55a0 ce7f
     -- frykten for herren er opphav til kunnskap --

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/4] scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1
  2023-07-28 11:34 ` [PATCH v3 1/4] scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1 Will Deacon
@ 2023-07-29 20:47   ` Masahiro Yamada
  2023-08-01 16:50     ` Nick Desaulniers
  0 siblings, 1 reply; 13+ messages in thread
From: Masahiro Yamada @ 2023-07-29 20:47 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, kernel-team, Nathan Chancellor, Nick Desaulniers,
	Nicolas Schier, Josh Poimboeuf, John Stultz, linux-kbuild

On Fri, Jul 28, 2023 at 8:34 PM Will Deacon <will@kernel.org> wrote:
>
> GNU utilities cannot necessarily parse objects built by LLVM, which can
> result in confusing errors when using 'faddr2line':
>
> $ CROSS_COMPILE=aarch64-linux-gnu- ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> aarch64-linux-gnu-addr2line: vmlinux: unknown type [0x13] section `.relr.dyn'
> aarch64-linux-gnu-addr2line: DWARF error: invalid or unhandled FORM value: 0x25
> do_one_initcall+0xf4/0x260:
> aarch64-linux-gnu-addr2line: vmlinux: unknown type [0x13] section `.relr.dyn'
> aarch64-linux-gnu-addr2line: DWARF error: invalid or unhandled FORM value: 0x25
> $x.73 at main.c:?
>
> Although this can be worked around by setting CROSS_COMPILE to "llvm=-",
> it's cleaner to follow the same syntax as the top-level Makefile and
> accept LLVM=1 as an indication to use the llvm- tools.


Just a note.
The top Makefile accepts not only LLVM=1
but also LLVM=/usr/lib/llvm-16/bin/.
The latter is useful when you want to use
a particular version or a custom one.

Another idea might be to use a generic '${prefix}'
as this is not hooked to the Makefile,
but I do not have a strong opinion.




>
> Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> Cc: John Stultz <jstultz@google.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  scripts/faddr2line | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/faddr2line b/scripts/faddr2line
> index 0e73aca4f908..62a3fa6f6f59 100755
> --- a/scripts/faddr2line
> +++ b/scripts/faddr2line
> @@ -58,8 +58,14 @@ die() {
>         exit 1
>  }
>
> -READELF="${CROSS_COMPILE:-}readelf"
> -ADDR2LINE="${CROSS_COMPILE:-}addr2line"
> +if [ "${LLVM:-}" == "1" ]; then
> +       UTIL_PREFIX=llvm-
> +else
> +       UTIL_PREFIX=${CROSS_COMPILE:-}
> +fi
> +
> +READELF="${UTIL_PREFIX}readelf"
> +ADDR2LINE="${UTIL_PREFIX}addr2line"
>  AWK="awk"
>  GREP="grep"
>
> --
> 2.41.0.487.g6d72f3e995-goog
>


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map
  2023-07-28 11:34 ` [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map Will Deacon
@ 2023-08-01 16:42   ` Nick Desaulniers
  2023-08-01 17:17     ` Sami Tolvanen
  2023-08-02 19:54   ` Masahiro Yamada
  1 sibling, 1 reply; 13+ messages in thread
From: Nick Desaulniers @ 2023-08-01 16:42 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: linux-kernel, kernel-team, Masahiro Yamada, Nathan Chancellor,
	Nicolas Schier, Josh Poimboeuf, John Stultz, linux-kbuild,
	Will Deacon, Fangrui Song

On Fri, Jul 28, 2023 at 4:34 AM Will Deacon <will@kernel.org> wrote:
>
> Some symbols emitted in the readelf output but filtered from System.map
> can confuse the 'faddr2line' symbol size calculation, resulting in the
> erroneous rejection of valid offsets. This is especially prevalent when
> building an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions
> are prefixed with a 32-bit data value in a '$d.n' section. For example:
>
> 447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
>    104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
>    106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
>    111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
>    112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
>     36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process

Sami,
Should we change the llvm-ir linkage type for these symbols from
`internal` to `private`?
https://llvm.org/docs/LangRef.html#linkage-types

Then they would not appear in the symbol table.

At first, I thought other modules might need to directly reference
this data, but with the local binding, I don't think they can.

>
> Adding a warning to do_one_initcall() results in:
>
>   | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
>
> Which 'faddr2line' refuses to accept:
>
> $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> no match for do_one_initcall+0xf4/0x260
>
> Filter out entries from readelf using the 'sysmap-ignored-syms.sed'
> script used to construct System.map, so that the size of a symbol is
> calculated as a delta to the next symbol present in ksymtab.
>
> Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> Cc: John Stultz <jstultz@google.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  scripts/faddr2line | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/faddr2line b/scripts/faddr2line
> index 62a3fa6f6f59..da734af90036 100755
> --- a/scripts/faddr2line
> +++ b/scripts/faddr2line
> @@ -64,6 +64,7 @@ else
>         UTIL_PREFIX=${CROSS_COMPILE:-}
>  fi
>
> +IGNORED_SYMS=$(dirname $0)/sysmap-ignored-syms.sed
>  READELF="${UTIL_PREFIX}readelf"
>  ADDR2LINE="${UTIL_PREFIX}addr2line"
>  AWK="awk"
> @@ -185,7 +186,7 @@ __faddr2line() {
>                                 found=2
>                                 break
>                         fi
> -               done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
> +               done < <(${READELF} --symbols --wide $objfile | sed -f ${IGNORED_SYMS} -e 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
>
>                 if [[ $found = 0 ]]; then
>                         warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size"
> --
> 2.41.0.487.g6d72f3e995-goog
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/4] scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1
  2023-07-29 20:47   ` Masahiro Yamada
@ 2023-08-01 16:50     ` Nick Desaulniers
  0 siblings, 0 replies; 13+ messages in thread
From: Nick Desaulniers @ 2023-08-01 16:50 UTC (permalink / raw)
  To: Will Deacon, Masahiro Yamada
  Cc: linux-kernel, kernel-team, Nathan Chancellor, Nicolas Schier,
	Josh Poimboeuf, John Stultz, linux-kbuild, clang-built-linux

On Sat, Jul 29, 2023 at 1:48 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Fri, Jul 28, 2023 at 8:34 PM Will Deacon <will@kernel.org> wrote:
> >
> > GNU utilities cannot necessarily parse objects built by LLVM, which can
> > result in confusing errors when using 'faddr2line':
> >
> > $ CROSS_COMPILE=aarch64-linux-gnu- ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> > aarch64-linux-gnu-addr2line: vmlinux: unknown type [0x13] section `.relr.dyn'

^ old GNU binutils missing support for RELR relocation format.
https://maskray.me/blog/2021-10-31-relative-relocations-and-relr

> > aarch64-linux-gnu-addr2line: DWARF error: invalid or unhandled FORM value: 0x25

^ old GNU binutils missing support for DWARFv5

I suppose if someone used a new GCC with an old binutils, they could
observe the exact same errors.

> > do_one_initcall+0xf4/0x260:
> > aarch64-linux-gnu-addr2line: vmlinux: unknown type [0x13] section `.relr.dyn'
> > aarch64-linux-gnu-addr2line: DWARF error: invalid or unhandled FORM value: 0x25
> > $x.73 at main.c:?
> >
> > Although this can be worked around by setting CROSS_COMPILE to "llvm=-",
> > it's cleaner to follow the same syntax as the top-level Makefile and
> > accept LLVM=1 as an indication to use the llvm- tools.
>
>
> Just a note.
> The top Makefile accepts not only LLVM=1
> but also LLVM=/usr/lib/llvm-16/bin/.
> The latter is useful when you want to use
> a particular version or a custom one.

Ah, like LLVM_PREFIX/LLVM_SUFFIX a la
commit e9c281928c24 ("kbuild: Make $(LLVM) more flexible")

Then we could have both UTIL_PREFIX+UTIL_SUFFIX.

>
> Another idea might be to use a generic '${prefix}'
> as this is not hooked to the Makefile,
> but I do not have a strong opinion.
>
>
>
>
> >
> > Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> > Cc: John Stultz <jstultz@google.com>
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> >  scripts/faddr2line | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/scripts/faddr2line b/scripts/faddr2line
> > index 0e73aca4f908..62a3fa6f6f59 100755
> > --- a/scripts/faddr2line
> > +++ b/scripts/faddr2line
> > @@ -58,8 +58,14 @@ die() {
> >         exit 1
> >  }
> >
> > -READELF="${CROSS_COMPILE:-}readelf"
> > -ADDR2LINE="${CROSS_COMPILE:-}addr2line"
> > +if [ "${LLVM:-}" == "1" ]; then
> > +       UTIL_PREFIX=llvm-
> > +else
> > +       UTIL_PREFIX=${CROSS_COMPILE:-}
> > +fi
> > +
> > +READELF="${UTIL_PREFIX}readelf"
> > +ADDR2LINE="${UTIL_PREFIX}addr2line"
> >  AWK="awk"
> >  GREP="grep"
> >
> > --
> > 2.41.0.487.g6d72f3e995-goog
> >
>
>
> --
> Best Regards
> Masahiro Yamada



-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map
  2023-08-01 16:42   ` Nick Desaulniers
@ 2023-08-01 17:17     ` Sami Tolvanen
  0 siblings, 0 replies; 13+ messages in thread
From: Sami Tolvanen @ 2023-08-01 17:17 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: linux-kernel, kernel-team, Masahiro Yamada, Nathan Chancellor,
	Nicolas Schier, Josh Poimboeuf, John Stultz, linux-kbuild,
	Will Deacon, Fangrui Song

On Tue, Aug 1, 2023 at 9:42 AM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Fri, Jul 28, 2023 at 4:34 AM Will Deacon <will@kernel.org> wrote:
> >
> > Some symbols emitted in the readelf output but filtered from System.map
> > can confuse the 'faddr2line' symbol size calculation, resulting in the
> > erroneous rejection of valid offsets. This is especially prevalent when
> > building an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions
> > are prefixed with a 32-bit data value in a '$d.n' section. For example:
> >
> > 447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
> >    104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
> >    106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
> >    111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
> >    112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
> >     36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process
>
> Sami,
> Should we change the llvm-ir linkage type for these symbols from
> `internal` to `private`?
> https://llvm.org/docs/LangRef.html#linkage-types
>
> Then they would not appear in the symbol table.
>
> At first, I thought other modules might need to directly reference
> this data, but with the local binding, I don't think they can.

For arm64, we don't explicitly emit symbols for type prefixes (see
AsmPrinter::emitKCFITypeId). These mapping symbols are emitted by
AArch64ELFStreamer to mark data and code regions. According to the
AArch64 ELF specification, "All mapping symbols have type STT_NOTYPE
and binding STB_LOCAL," so I assume changing the linkage type isn't an
option:

https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#mapping-symbols

Sami

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map
  2023-07-28 11:34 ` [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map Will Deacon
  2023-08-01 16:42   ` Nick Desaulniers
@ 2023-08-02 19:54   ` Masahiro Yamada
  2023-08-04 14:30     ` Will Deacon
  1 sibling, 1 reply; 13+ messages in thread
From: Masahiro Yamada @ 2023-08-02 19:54 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, kernel-team, Nathan Chancellor, Nick Desaulniers,
	Nicolas Schier, Josh Poimboeuf, John Stultz, linux-kbuild

On Fri, Jul 28, 2023 at 8:34 PM Will Deacon <will@kernel.org> wrote:
>
> Some symbols emitted in the readelf output but filtered from System.map
> can confuse the 'faddr2line' symbol size calculation, resulting in the
> erroneous rejection of valid offsets. This is especially prevalent when
> building an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions
> are prefixed with a 32-bit data value in a '$d.n' section. For example:
>
> 447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
>    104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
>    106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
>    111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
>    112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
>     36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process
>
> Adding a warning to do_one_initcall() results in:
>
>   | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
>
> Which 'faddr2line' refuses to accept:
>
> $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> no match for do_one_initcall+0xf4/0x260
>
> Filter out entries from readelf using the 'sysmap-ignored-syms.sed'
> script used to construct System.map, so that the size of a symbol is
> calculated as a delta to the next symbol present in ksymtab.


I do not think this patch set is the right approach.

I assume faddr2line is meant to work with both vmlinux
and modules.

A problem is that we have different filtering policies wrt kallsyms.

scripts/mksysmap filters symbols in vmlinux,
while kernel/module/kallsyms.c filters ones in modules.

This patch tries to get aligned with the stacktrace of vmlinux,
but that does not seem optimal to the stacktrace of modules.


I have not checked the details, but I guess
the module kallsyms filters less symbols.

https://github.com/torvalds/linux/blob/v6.5-rc4/kernel/module/kallsyms.c#L288


I prefer filtering symbols in the intersection of vmlinux and modules.

is_mapping_symbol() filters symbols you are addressing.







> Cc: Josh Poimboeuf <jpoimboe@kernel.org>
> Cc: John Stultz <jstultz@google.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  scripts/faddr2line | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/faddr2line b/scripts/faddr2line
> index 62a3fa6f6f59..da734af90036 100755
> --- a/scripts/faddr2line
> +++ b/scripts/faddr2line
> @@ -64,6 +64,7 @@ else
>         UTIL_PREFIX=${CROSS_COMPILE:-}
>  fi
>
> +IGNORED_SYMS=$(dirname $0)/sysmap-ignored-syms.sed
>  READELF="${UTIL_PREFIX}readelf"
>  ADDR2LINE="${UTIL_PREFIX}addr2line"
>  AWK="awk"
> @@ -185,7 +186,7 @@ __faddr2line() {
>                                 found=2
>                                 break
>                         fi
> -               done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
> +               done < <(${READELF} --symbols --wide $objfile | sed -f ${IGNORED_SYMS} -e 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
>
>                 if [[ $found = 0 ]]; then
>                         warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size"
> --
> 2.41.0.487.g6d72f3e995-goog
>


--
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map
  2023-08-02 19:54   ` Masahiro Yamada
@ 2023-08-04 14:30     ` Will Deacon
  2023-08-07 20:06       ` Masahiro Yamada
  0 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2023-08-04 14:30 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: linux-kernel, kernel-team, Nathan Chancellor, Nick Desaulniers,
	Nicolas Schier, Josh Poimboeuf, John Stultz, linux-kbuild

On Thu, Aug 03, 2023 at 04:54:37AM +0900, Masahiro Yamada wrote:
> On Fri, Jul 28, 2023 at 8:34 PM Will Deacon <will@kernel.org> wrote:
> >
> > Some symbols emitted in the readelf output but filtered from System.map
> > can confuse the 'faddr2line' symbol size calculation, resulting in the
> > erroneous rejection of valid offsets. This is especially prevalent when
> > building an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions
> > are prefixed with a 32-bit data value in a '$d.n' section. For example:
> >
> > 447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
> >    104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
> >    106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
> >    111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
> >    112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
> >     36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process
> >
> > Adding a warning to do_one_initcall() results in:
> >
> >   | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
> >
> > Which 'faddr2line' refuses to accept:
> >
> > $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> > skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> > no match for do_one_initcall+0xf4/0x260
> >
> > Filter out entries from readelf using the 'sysmap-ignored-syms.sed'
> > script used to construct System.map, so that the size of a symbol is
> > calculated as a delta to the next symbol present in ksymtab.
> 
> 
> I do not think this patch set is the right approach.
> 
> I assume faddr2line is meant to work with both vmlinux
> and modules.

Huh, it seems to be busted for modules :/ I get:

 | error: unknown argument '--section=.text'

with llvm and:

 | addr2line: DWARF error: invalid or unhandled FORM value: 0x25

with binutils.

I'll look into this, as I don't think it's related to symbol filtering.

> A problem is that we have different filtering policies wrt kallsyms.
> 
> scripts/mksysmap filters symbols in vmlinux,
> while kernel/module/kallsyms.c filters ones in modules.

I don't understand why we need two different ways of filtering out
symbols, but it appears that the module case only filters out local
labels and mapping symbols, both of which are filtered out of vmlinux
as well. Is that right?

> This patch tries to get aligned with the stacktrace of vmlinux,
> but that does not seem optimal to the stacktrace of modules.
> 
> 
> I have not checked the details, but I guess
> the module kallsyms filters less symbols.
> 
> https://github.com/torvalds/linux/blob/v6.5-rc4/kernel/module/kallsyms.c#L288
> 
> I prefer filtering symbols in the intersection of vmlinux and modules.

I think mksysmap filters out a superset of the symbols which are filtered
for modules, so why is the intersection the right thing to do? That will
mean that faddr2line considers a whole bunch of symbols that aren't in
the ksymtab of vmlinux.

> is_mapping_symbol() filters symbols you are addressing.

That's a C function and faddr2line is a shell script. What exactly do
you want me to do? My first hack just matched on symbols starting with
'$' but I ended up with this after other review feedback.

Josh -- how do you want to proceed here?

Will

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map
  2023-08-04 14:30     ` Will Deacon
@ 2023-08-07 20:06       ` Masahiro Yamada
  0 siblings, 0 replies; 13+ messages in thread
From: Masahiro Yamada @ 2023-08-07 20:06 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, kernel-team, Nathan Chancellor, Nick Desaulniers,
	Nicolas Schier, Josh Poimboeuf, John Stultz, linux-kbuild

On Fri, Aug 4, 2023 at 11:30 PM Will Deacon <will@kernel.org> wrote:
>
> On Thu, Aug 03, 2023 at 04:54:37AM +0900, Masahiro Yamada wrote:
> > On Fri, Jul 28, 2023 at 8:34 PM Will Deacon <will@kernel.org> wrote:
> > >
> > > Some symbols emitted in the readelf output but filtered from System.map
> > > can confuse the 'faddr2line' symbol size calculation, resulting in the
> > > erroneous rejection of valid offsets. This is especially prevalent when
> > > building an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions
> > > are prefixed with a 32-bit data value in a '$d.n' section. For example:
> > >
> > > 447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
> > >    104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
> > >    106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
> > >    111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
> > >    112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
> > >     36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process
> > >
> > > Adding a warning to do_one_initcall() results in:
> > >
> > >   | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
> > >
> > > Which 'faddr2line' refuses to accept:
> > >
> > > $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> > > skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> > > no match for do_one_initcall+0xf4/0x260
> > >
> > > Filter out entries from readelf using the 'sysmap-ignored-syms.sed'
> > > script used to construct System.map, so that the size of a symbol is
> > > calculated as a delta to the next symbol present in ksymtab.
> >
> >
> > I do not think this patch set is the right approach.
> >
> > I assume faddr2line is meant to work with both vmlinux
> > and modules.
>
> Huh, it seems to be busted for modules :/ I get:
>
>  | error: unknown argument '--section=.text'
>
> with llvm and:
>
>  | addr2line: DWARF error: invalid or unhandled FORM value: 0x25
>
> with binutils.
>
> I'll look into this, as I don't think it's related to symbol filtering.
>
> > A problem is that we have different filtering policies wrt kallsyms.
> >
> > scripts/mksysmap filters symbols in vmlinux,
> > while kernel/module/kallsyms.c filters ones in modules.
>
> I don't understand why we need two different ways of filtering out
> symbols, but it appears that the module case only filters out local
> labels and mapping symbols, both of which are filtered out of vmlinux
> as well. Is that right?


For vmlinux kallsyms, the reason is the annoyance in the build process.
kallsyms requires linking vmlinux three times.
(see scripts/link-vmlinux.sh if you are interested)


[1a] link vmlinux without symbol table
[1b] Generate symbol list from [1a]
[2a] link vmlinux, embedding the temp kallsyms created by [1b]
[2b] Generate symbol list from [2a]
[3a] link vmlinux, embedding the final kallsyms created by [2b]


Our assumption is that [1b] and [2b] have the same number of
symbols, so that [2a] and [3a] can embed the same size kallsyms.


However, the process [2a] inserts the kallsyms table
in the middle of vmlinux.
It may cause the compiler to emit additional symbols
(e.g. ARM veneers) due to the inserted kallsyms.


So, we filter out all compiler-generated symbols so that
[1b] and [2b] have the same number of symbols.


If we do not do it,
we may have more and more kallsyms steps.

Emitted veneers increase kallsyms size
-> Increased kallsyms causes more veneers
-> The new veneers increase kallsyms size
-> Increased kallsyms causes yet more veneers
-> The new veneers increase kallsyms size again
-> (this cycle may continue again and again)






> > This patch tries to get aligned with the stacktrace of vmlinux,
> > but that does not seem optimal to the stacktrace of modules.
> >
> >
> > I have not checked the details, but I guess
> > the module kallsyms filters less symbols.
> >
> > https://github.com/torvalds/linux/blob/v6.5-rc4/kernel/module/kallsyms.c#L288
> >
> > I prefer filtering symbols in the intersection of vmlinux and modules.
>
> I think mksysmap filters out a superset of the symbols which are filtered
> for modules, so why is the intersection the right thing to do? That will
> mean that faddr2line considers a whole bunch of symbols that aren't in
> the ksymtab of vmlinux.



If A is a subset of B,
(A & B) == A.


I meant "the intersection of the filtering rules of vmlinux and modules"
== "the filtering rule of modules".







>
> > is_mapping_symbol() filters symbols you are addressing.
>
> That's a C function and faddr2line is a shell script. What exactly do
> you want me to do? My first hack just matched on symbols starting with
> '$' but I ended up with this after other review feedback.


In linux-next, I see this:

static inline int is_mapping_symbol(const char *str)
{
        if (str[0] == '.' && str[1] == 'L')
                return true;
        if (str[0] == 'L' && str[1] == '0')
                return true;
        return str[0] == '$';
}


It is pretty easy to write a sed script to do the same thing.





Another possibility is to use different filtering rules
between vmlinux and modules.
(but I do not know if this complexity is worthwhile)



[pseudo code]


get_symbol_list () {

    if [ "${is_vmlinux}" = 1 ]; then
            {READELF} --symbols --wide $objfile | sed -f
scripts/sysmap-ignored-syms.sed
    else
            {READELF} --symbols --wide $objfile | sed  'remove ".L", 'L0", "^$"'
    fi
}




>
> Josh -- how do you want to proceed here?
>
> Will



-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-08-07 20:06 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-28 11:34 [PATCH v3 0/4] Fix 'faddr2line' for LLVM arm64 builds Will Deacon
2023-07-28 11:34 ` [PATCH v3 1/4] scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1 Will Deacon
2023-07-29 20:47   ` Masahiro Yamada
2023-08-01 16:50     ` Nick Desaulniers
2023-07-28 11:34 ` [PATCH v3 2/4] scripts/mksysmap: Factor out sed ignored symbols expression into script Will Deacon
2023-07-29 18:38   ` Nicolas Schier
2023-07-28 11:34 ` [PATCH v3 3/4] scripts/faddr2line: Constrain readelf output to symbols from System.map Will Deacon
2023-08-01 16:42   ` Nick Desaulniers
2023-08-01 17:17     ` Sami Tolvanen
2023-08-02 19:54   ` Masahiro Yamada
2023-08-04 14:30     ` Will Deacon
2023-08-07 20:06       ` Masahiro Yamada
2023-07-28 11:34 ` [PATCH v3 4/4] scripts/faddr2line: Don't filter out non-function symbols from readelf Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).