* [PATCH V8 4/4] Adjust objdump start/end range per map pgoff parameter
2024-09-09 21:30 [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Steve Clevenger
@ 2024-09-09 21:29 ` Steve Clevenger
2024-09-10 8:03 ` Leo Yan
2024-09-09 21:29 ` [PATCH V8 3/4] Add map pgoff to python dictionary based on MAPPING_TYPE Steve Clevenger
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Steve Clevenger @ 2024-09-09 21:29 UTC (permalink / raw)
To: leo.yan, james.clark, mike.leach
Cc: suzuki.poulose, ilkka, coresight, linux-perf-users,
linux-arm-kernel
Extract map_pgoff parameter from the dictionary, and adjust start/end
range passed to objdump based on the value.
A zero start_addr is filtered to prevent output of dso address range
check failures. This script repeatedly sees a zero value passed
in for
start_addr = cpu_data[str(cpu) + 'addr']
These zero values are not a new problem. The start_addr/stop_addr warning
clutters the instruction trace output, hence this change.
Signed-off-by: Steve Clevenger <scclevenger@os.amperecomputing.com>
---
.../perf/scripts/python/arm-cs-trace-disasm.py | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/tools/perf/scripts/python/arm-cs-trace-disasm.py b/tools/perf/scripts/python/arm-cs-trace-disasm.py
index 7aff02d84ffb..0d97fcf51b91 100755
--- a/tools/perf/scripts/python/arm-cs-trace-disasm.py
+++ b/tools/perf/scripts/python/arm-cs-trace-disasm.py
@@ -187,6 +187,10 @@ def process_event(param_dict):
dso_start = get_optional(param_dict, "dso_map_start")
dso_end = get_optional(param_dict, "dso_map_end")
symbol = get_optional(param_dict, "symbol")
+ map_pgoff = get_optional(param_dict, "map_pgoff")
+ # check for valid map offset
+ if (str(map_pgoff) == '[unknown]'):
+ map_pgoff = 0
cpu = sample["cpu"]
ip = sample["ip"]
@@ -243,9 +247,11 @@ def process_event(param_dict):
# Record for previous sample packet
cpu_data[str(cpu) + 'addr'] = addr
- # Handle CS_ETM_TRACE_ON packet if start_addr=0 and stop_addr=4
- if (start_addr == 0 and stop_addr == 4):
- print("CPU%d: CS_ETM_TRACE_ON packet is inserted" % cpu)
+ # Filter out zero start_address. Optionally identify CS_ETM_TRACE_ON packet
+ # if start_addr=0 and stop_addr=4.
+ if (start_addr == 0):
+ if ((stop_addr == 4) and (options.verbose == True)):
+ print("CPU%d: CS_ETM_TRACE_ON packet is inserted" % cpu)
return
if (start_addr < int(dso_start) or start_addr > int(dso_end)):
@@ -262,13 +268,14 @@ def process_event(param_dict):
# vm_start to zero.
if (dso == "[kernel.kallsyms]" or dso_start == 0x400000):
dso_vm_start = 0
+ map_pgoff = 0
else:
dso_vm_start = int(dso_start)
dso_fname = get_dso_file_path(dso, dso_bid)
if path.exists(dso_fname):
- print_disam(dso_fname, dso_vm_start, start_addr, stop_addr)
+ print_disam(dso_fname, dso_vm_start, start_addr + map_pgoff, stop_addr + map_pgoff)
else:
- print("Failed to find dso %s for address range [ 0x%x .. 0x%x ]" % (dso, start_addr, stop_addr))
+ print("Failed to find dso %s for address range [ 0x%x .. 0x%x ]" % (dso, start_addr + map_pgoff, stop_addr + map_pgoff))
print_srccode(comm, param_dict, sample, symbol, dso)
--
2.44.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH V8 3/4] Add map pgoff to python dictionary based on MAPPING_TYPE
2024-09-09 21:30 [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Steve Clevenger
2024-09-09 21:29 ` [PATCH V8 4/4] Adjust objdump start/end range per map pgoff parameter Steve Clevenger
@ 2024-09-09 21:29 ` Steve Clevenger
2024-09-09 21:30 ` [PATCH V8 2/4] Force MAPPING_TYPE__IDENTIY for PIE Steve Clevenger
` (2 subsequent siblings)
4 siblings, 0 replies; 14+ messages in thread
From: Steve Clevenger @ 2024-09-09 21:29 UTC (permalink / raw)
To: leo.yan, james.clark, mike.leach
Cc: suzuki.poulose, ilkka, coresight, linux-perf-users,
linux-arm-kernel
Add map_pgoff parameter to python dictionary so it can be seen by the
python script, arm-cs-trace-disasm.py. map_pgoff is forced to zero in
the dictionary if file type is MAPPING_TYPE__IDENTITY. Otherwise, the
map_pgoff value is directly added to the dictionary.
Signed-off-by: Steve Clevenger <scclevenger@os.amperecomputing.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
---
.../util/scripting-engines/trace-event-python.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 6971dd6c231f..7b96504d5406 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -798,7 +798,8 @@ static int set_regs_in_dict(PyObject *dict,
static void set_sym_in_dict(PyObject *dict, struct addr_location *al,
const char *dso_field, const char *dso_bid_field,
const char *dso_map_start, const char *dso_map_end,
- const char *sym_field, const char *symoff_field)
+ const char *sym_field, const char *symoff_field,
+ const char *map_pgoff)
{
char sbuild_id[SBUILD_ID_SIZE];
@@ -814,6 +815,12 @@ static void set_sym_in_dict(PyObject *dict, struct addr_location *al,
PyLong_FromUnsignedLong(map__start(al->map)));
pydict_set_item_string_decref(dict, dso_map_end,
PyLong_FromUnsignedLong(map__end(al->map)));
+ if (al->map->mapping_type == MAPPING_TYPE__DSO)
+ pydict_set_item_string_decref(dict, map_pgoff,
+ PyLong_FromUnsignedLongLong(al->map->pgoff));
+ else
+ pydict_set_item_string_decref(dict, map_pgoff,
+ PyLong_FromUnsignedLongLong(0));
}
if (al->sym) {
pydict_set_item_string_decref(dict, sym_field,
@@ -900,7 +907,7 @@ static PyObject *get_perf_sample_dict(struct perf_sample *sample,
pydict_set_item_string_decref(dict, "comm",
_PyUnicode_FromString(thread__comm_str(al->thread)));
set_sym_in_dict(dict, al, "dso", "dso_bid", "dso_map_start", "dso_map_end",
- "symbol", "symoff");
+ "symbol", "symoff", "map_pgoff");
pydict_set_item_string_decref(dict, "callchain", callchain);
@@ -925,7 +932,7 @@ static PyObject *get_perf_sample_dict(struct perf_sample *sample,
PyBool_FromLong(1));
set_sym_in_dict(dict_sample, addr_al, "addr_dso", "addr_dso_bid",
"addr_dso_map_start", "addr_dso_map_end",
- "addr_symbol", "addr_symoff");
+ "addr_symbol", "addr_symoff", "addr_map_pgoff");
}
if (sample->flags)
--
2.44.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH V8 2/4] Force MAPPING_TYPE__IDENTIY for PIE
2024-09-09 21:30 [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Steve Clevenger
2024-09-09 21:29 ` [PATCH V8 4/4] Adjust objdump start/end range per map pgoff parameter Steve Clevenger
2024-09-09 21:29 ` [PATCH V8 3/4] Add map pgoff to python dictionary based on MAPPING_TYPE Steve Clevenger
@ 2024-09-09 21:30 ` Steve Clevenger
2024-09-09 21:30 ` [PATCH V8 1/4] Add dso__is_pie call to identify ELF PIE Steve Clevenger
2024-10-08 5:51 ` [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Namhyung Kim
4 siblings, 0 replies; 14+ messages in thread
From: Steve Clevenger @ 2024-09-09 21:30 UTC (permalink / raw)
To: leo.yan, james.clark, mike.leach
Cc: suzuki.poulose, ilkka, coresight, linux-perf-users,
linux-arm-kernel
Use dso__is_pie() to check whether the DSO file is a Position
Independent Executable (PIE). If PIE, change the MAPPING_TYPE to
MAPPING_TYPE__IDENTITY so a zero map pgoff (text offset) is passed
into the script.
Signed-off-by: Steve Clevenger <scclevenger@os.amperecomputing.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
---
tools/perf/util/map.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index e781c8d56a9a..c846faec177b 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -173,8 +173,8 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
assert(!dso__kernel(dso));
map__init(result, start, start + len, pgoff, dso, prot, flags);
- if (anon || no_dso) {
- map->mapping_type = MAPPING_TYPE__IDENTITY;
+ if (anon || no_dso || dso__is_pie(dso)) {
+ map__set_mapping_type(map, MAPPING_TYPE__IDENTITY);
/*
* Set memory without DSO as loaded. All map__find_*
--
2.44.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH V8 1/4] Add dso__is_pie call to identify ELF PIE
2024-09-09 21:30 [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Steve Clevenger
` (2 preceding siblings ...)
2024-09-09 21:30 ` [PATCH V8 2/4] Force MAPPING_TYPE__IDENTIY for PIE Steve Clevenger
@ 2024-09-09 21:30 ` Steve Clevenger
2024-10-09 21:13 ` Leo Yan
2024-10-08 5:51 ` [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Namhyung Kim
4 siblings, 1 reply; 14+ messages in thread
From: Steve Clevenger @ 2024-09-09 21:30 UTC (permalink / raw)
To: leo.yan, james.clark, mike.leach
Cc: suzuki.poulose, ilkka, coresight, linux-perf-users,
linux-arm-kernel
Add dso__is_pie global to read the .dynamic section DT_FLAGS_1 entry for
the DF_1_PIE flag. This identifies position executable code.
Signed-off-by: Steve Clevenger <scclevenger@os.amperecomputing.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
---
tools/perf/util/symbol-elf.c | 61 ++++++++++++++++++++++++++++++++++++
tools/perf/util/symbol.h | 1 +
2 files changed, 62 insertions(+)
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index e398abfd13a0..babe47976922 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -662,6 +662,67 @@ static int dso__synthesize_plt_got_symbols(struct dso *dso, Elf *elf,
return err;
}
+/*
+ * Check dynamic section DT_FLAGS_1 for a Position Independent
+ * Executable (PIE).
+ */
+bool dso__is_pie(struct dso *dso)
+{
+ Elf *elf = NULL;
+ Elf_Scn *scn = NULL;
+ GElf_Ehdr ehdr;
+ GElf_Shdr shdr;
+ bool is_pie = false;
+ char dso_path[PATH_MAX];
+ int fd = -1;
+
+ if (!dso || (elf_version(EV_CURRENT) == EV_NONE))
+ goto exit; // false
+
+ dso__build_id_filename(dso, dso_path, sizeof(dso_path), false);
+
+ fd = open(dso_path, O_RDONLY);
+
+ if (fd < 0) {
+ pr_debug("%s: cannot read cached %s.\n", __func__, dso_path);
+ goto exit; // false
+ }
+
+ elf = elf_begin(fd, ELF_C_READ, NULL);
+ gelf_getehdr(elf, &ehdr);
+
+ if (ehdr.e_type == ET_DYN) {
+ Elf_Data *data;
+ GElf_Dyn *entry;
+ int n_entries = shdr.sh_size / sizeof(GElf_Dyn);
+
+ scn = elf_section_by_name(elf, &ehdr, &shdr, ".dynamic", NULL);
+ if (!scn)
+ goto exit_close; // false
+
+ data = (Elf_Data *) elf_getdata(scn, NULL);
+ if (!data || !data->d_buf)
+ goto exit_close; // false
+
+ // check DT_FLAGS_1
+ for (int i = 0; i < n_entries; i++) {
+ entry = ((GElf_Dyn *) data->d_buf) + i;
+ if (entry->d_tag == DT_FLAGS_1) {
+ if ((entry->d_un.d_val & DF_1_PIE) != 0) {
+ is_pie = true;
+ break;
+ }
+ }
+ } // end for
+ }
+
+exit_close:
+ elf_end(elf);
+ close(fd);
+exit:
+ return is_pie;
+}
+
/*
* We need to check if we have a .dynsym, so that we can handle the
* .plt, synthesizing its symbols, that aren't on the symtabs (be it
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 3fb5d146d9b1..33ea2596ce31 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -127,6 +127,7 @@ void dso__insert_symbol(struct dso *dso,
struct symbol *sym);
void dso__delete_symbol(struct dso *dso,
struct symbol *sym);
+bool dso__is_pie(struct dso *dso);
struct symbol *dso__find_symbol(struct dso *dso, u64 addr);
struct symbol *dso__find_symbol_nocache(struct dso *dso, u64 addr);
--
2.44.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset
@ 2024-09-09 21:30 Steve Clevenger
2024-09-09 21:29 ` [PATCH V8 4/4] Adjust objdump start/end range per map pgoff parameter Steve Clevenger
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Steve Clevenger @ 2024-09-09 21:30 UTC (permalink / raw)
To: leo.yan, james.clark, mike.leach
Cc: suzuki.poulose, ilkka, coresight, linux-perf-users,
linux-arm-kernel
Changes in V8:
- in arm-cs-trace-disasm.py, ensure map_pgoff is not converted to
string.
- Remove map_pgoff integer conversion in dso not found print
message.
Changes in V7:
- In arm-cs-trace-disasm.py, fix print message core dump resulting
from mixed type arithmetic.
- Modify CS_ETM_TRACE_ON filter to filter zero start_addr. The
CS_ETM_TRACE_ON message is changed to print only in verbose mode.
- Removed verbose mode only notification for start_addr/stop_addr
outside of dso address range.
Changes in V6:
- In arm-cs-trace-disasm.py, zero map_pgoff for kernel files. Add
map_pgoff to start/end address for dso not found message.
- Added "Reviewed-by" trailer for patches 1-3 previously reviewed
by Leo Yan in V4 and V5.
Changes in V5:
- In symbol-elf.c, branch to exit_close label if open file.
- In trace_event_python.c, correct indentation. set_sym_in_dict
call parameter "map_pgoff" renamed as "addr_map_pgoff" to
match local naming.
Changes in V4:
- In trace-event-python.c, fixed perf-tools-next merge problem.
Changes in V3:
- Rebased to linux-perf-tools branch.
- Squash symbol-elf.c and symbol.h into same commit.
- In map.c, merge dso__is_pie() call into existing if statement.
- In arm-cs-trace-disasm.py, remove debug artifacts.
Changes in V2:
- In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
checks per Leo Yan review.
- Updated mailing list distribution.
Steve Clevenger (4):
Add dso__is_pie call to identify ELF PIE
Force MAPPING_TYPE__IDENTIY for PIE
Add map pgoff to python dictionary based on MAPPING_TYPE
Adjust objdump start/end range per map pgoff parameter
.../scripts/python/arm-cs-trace-disasm.py | 17 ++++--
tools/perf/util/map.c | 4 +-
.../scripting-engines/trace-event-python.c | 13 +++-
tools/perf/util/symbol-elf.c | 61 +++++++++++++++++++
tools/perf/util/symbol.h | 1 +
5 files changed, 86 insertions(+), 10 deletions(-)
--
2.44.0
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V8 4/4] Adjust objdump start/end range per map pgoff parameter
2024-09-09 21:29 ` [PATCH V8 4/4] Adjust objdump start/end range per map pgoff parameter Steve Clevenger
@ 2024-09-10 8:03 ` Leo Yan
0 siblings, 0 replies; 14+ messages in thread
From: Leo Yan @ 2024-09-10 8:03 UTC (permalink / raw)
To: Steve Clevenger, james.clark, mike.leach
Cc: suzuki.poulose, ilkka, coresight, linux-perf-users,
linux-arm-kernel
On 9/9/2024 10:29 PM, Steve Clevenger wrote:
>
> Extract map_pgoff parameter from the dictionary, and adjust start/end
> range passed to objdump based on the value.
>
> A zero start_addr is filtered to prevent output of dso address range
> check failures. This script repeatedly sees a zero value passed
> in for
> start_addr = cpu_data[str(cpu) + 'addr']
>
> These zero values are not a new problem. The start_addr/stop_addr warning
> clutters the instruction trace output, hence this change.
>
> Signed-off-by: Steve Clevenger <scclevenger@os.amperecomputing.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset
2024-09-09 21:30 [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Steve Clevenger
` (3 preceding siblings ...)
2024-09-09 21:30 ` [PATCH V8 1/4] Add dso__is_pie call to identify ELF PIE Steve Clevenger
@ 2024-10-08 5:51 ` Namhyung Kim
2024-10-08 18:24 ` Steve Clevenger
4 siblings, 1 reply; 14+ messages in thread
From: Namhyung Kim @ 2024-10-08 5:51 UTC (permalink / raw)
To: Steve Clevenger
Cc: leo.yan, james.clark, mike.leach, suzuki.poulose, ilkka,
coresight, linux-perf-users, linux-arm-kernel
Hello,
Sorry for the long delay. But can you please explain your problem in
more detail? Showing ELF program (or section) header would be helpful
as well.
Is the problem when the text mapping has non-zero pgoff only? Is the
kernel symbols working correctly? Or is it just the Python script
broken?
Thanks,
Namhyung
On Mon, Sep 09, 2024 at 03:30:02PM -0600, Steve Clevenger wrote:
> Changes in V8:
> - in arm-cs-trace-disasm.py, ensure map_pgoff is not converted to
> string.
> - Remove map_pgoff integer conversion in dso not found print
> message.
>
> Changes in V7:
> - In arm-cs-trace-disasm.py, fix print message core dump resulting
> from mixed type arithmetic.
> - Modify CS_ETM_TRACE_ON filter to filter zero start_addr. The
> CS_ETM_TRACE_ON message is changed to print only in verbose mode.
> - Removed verbose mode only notification for start_addr/stop_addr
> outside of dso address range.
>
> Changes in V6:
> - In arm-cs-trace-disasm.py, zero map_pgoff for kernel files. Add
> map_pgoff to start/end address for dso not found message.
> - Added "Reviewed-by" trailer for patches 1-3 previously reviewed
> by Leo Yan in V4 and V5.
>
> Changes in V5:
> - In symbol-elf.c, branch to exit_close label if open file.
> - In trace_event_python.c, correct indentation. set_sym_in_dict
> call parameter "map_pgoff" renamed as "addr_map_pgoff" to
> match local naming.
>
> Changes in V4:
> - In trace-event-python.c, fixed perf-tools-next merge problem.
>
> Changes in V3:
> - Rebased to linux-perf-tools branch.
> - Squash symbol-elf.c and symbol.h into same commit.
> - In map.c, merge dso__is_pie() call into existing if statement.
> - In arm-cs-trace-disasm.py, remove debug artifacts.
>
> Changes in V2:
> - In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
> checks per Leo Yan review.
> - Updated mailing list distribution.
>
> Steve Clevenger (4):
> Add dso__is_pie call to identify ELF PIE
> Force MAPPING_TYPE__IDENTIY for PIE
> Add map pgoff to python dictionary based on MAPPING_TYPE
> Adjust objdump start/end range per map pgoff parameter
>
> .../scripts/python/arm-cs-trace-disasm.py | 17 ++++--
> tools/perf/util/map.c | 4 +-
> .../scripting-engines/trace-event-python.c | 13 +++-
> tools/perf/util/symbol-elf.c | 61 +++++++++++++++++++
> tools/perf/util/symbol.h | 1 +
> 5 files changed, 86 insertions(+), 10 deletions(-)
>
> --
> 2.44.0
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset
2024-10-08 5:51 ` [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Namhyung Kim
@ 2024-10-08 18:24 ` Steve Clevenger
2024-10-08 21:00 ` Namhyung Kim
0 siblings, 1 reply; 14+ messages in thread
From: Steve Clevenger @ 2024-10-08 18:24 UTC (permalink / raw)
To: Namhyung Kim
Cc: leo.yan, james.clark, mike.leach, suzuki.poulose, ilkka,
coresight, linux-perf-users, linux-arm-kernel
Hello Namhyung,
Sorry your question is so late. I don't include the ELF headers here,
but the problem can be seen with a perf.data packet dump of user
instruction trace capture. The problem is with the non-zero pgoff. The
arm-cs-trace-disasm.py script was never passed pgoff information to
adjust the start/end disassemble range passed to objdump. This patch
distributes the fix between perf and the arm-cs-trace-disasm.py script.
Here's a brief excerpt from an e-mail I sent to James Clark describing
the patch before I submitted it.
Regards,
Steve C.
-----------------------------------------------------------------
.
.
This e-mail is to document what I know and don’t know about the problem.
The background is Fedora 38 introduced non-zero text offsets into common
shared objects and executables (e.g. libc.so.6, etc.).
If you were to ‘perf report --dump’ the perf.data of a user mode
instruction trace on Fedora 37 and grep the PERF_RECORD_MMAP2 packets
you’ll notice all zero values (@ 0) for shared binary and executable
text offsets. Repeat the same for user trace collected on Fedora 38/39,
and these text offsets show as non-zero.
Fedora 37:
4294967295 18446744073709551615 0x8ac8 [0x78]: PERF_RECORD_MMAP2
2389577/2389577: [0xffff85124000(0x42000) @ 0 103:04 805306555
1941233998]: r-xp /usr/lib/ld-linux-aarch64.so.1
4294967295 18446744073709551615 0x8b40 [0x60]: PERF_RECORD_MMAP2
2389577/2389577: [0xffff85161000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
4294967295 18446744073709551615 0x8ba0 [0x68]: PERF_RECORD_MMAP2
2389577/2389577: [0xaaaab09b0000(0x10000) @ 0 103:04 1140851919
994218038]: r-xp /usr/bin/dd
4294967295 18446744073709551615 0x8c08 [0x70]: PERF_RECORD_MMAP2
2389577/2389577: [0xffff84f70000(0x1a9000) @ 0 103:04 1677721733
3891973508]: r-xp /usr/lib64/libc.so.6
Fedora 39:
4294967295 18446744073709551615 0x8ac8 [0x78]: PERF_RECORD_MMAP2
18229/18229: [0xffffa5512000(0x1d000) @ 0x10000 103:04 161 4093340249]:
r-xp /usr/lib/ld-linux-aarch64.so.1
4294967295 18446744073709551615 0x8b40 [0x60]: PERF_RECORD_MMAP2
18229/18229: [0xffffa554f000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
4294967295 18446744073709551615 0x8ba0 [0x68]: PERF_RECORD_MMAP2
18229/18229: [0xaaaade7e0000(0xb000) @ 0x10000 103:04 536876423
421616320]: r-xp /usr/bin/dd
4294967295 18446744073709551615 0x8c08 [0x70]: PERF_RECORD_MMAP2
18229/18229: [0xffffa5360000(0x11f000) @ 0x30000 103:04 536873199
2415801053]: r-xp /usr/lib64/libc.so.6
The arm-cs-trace-disasm.py script never gets to see the text offset into
the dso binaries, so has no opportunity to adjust the start/end address
range passed to objdump. This wasn’t a problem on Fedora 37 and below
since there’s no start/end adjustment for a zero text offset. On Fedora
38/39 distros, the instruction trace shows unconditional branch
instructions which do not branch to the target address, the clearest
indication of trouble.
-----------------------------------------------------------------
On 10/7/2024 10:51 PM, Namhyung Kim wrote:
> Hello,
>
> Sorry for the long delay. But can you please explain your problem in
> more detail? Showing ELF program (or section) header would be helpful
> as well.
>
> Is the problem when the text mapping has non-zero pgoff only? Is the
> kernel symbols working correctly? Or is it just the Python script
> broken?
>
> Thanks,
> Namhyung
>
> On Mon, Sep 09, 2024 at 03:30:02PM -0600, Steve Clevenger wrote:
>> Changes in V8:
>> - in arm-cs-trace-disasm.py, ensure map_pgoff is not converted to
>> string.
>> - Remove map_pgoff integer conversion in dso not found print
>> message.
>>
>> Changes in V7:
>> - In arm-cs-trace-disasm.py, fix print message core dump resulting
>> from mixed type arithmetic.
>> - Modify CS_ETM_TRACE_ON filter to filter zero start_addr. The
>> CS_ETM_TRACE_ON message is changed to print only in verbose mode.
>> - Removed verbose mode only notification for start_addr/stop_addr
>> outside of dso address range.
>>
>> Changes in V6:
>> - In arm-cs-trace-disasm.py, zero map_pgoff for kernel files. Add
>> map_pgoff to start/end address for dso not found message.
>> - Added "Reviewed-by" trailer for patches 1-3 previously reviewed
>> by Leo Yan in V4 and V5.
>>
>> Changes in V5:
>> - In symbol-elf.c, branch to exit_close label if open file.
>> - In trace_event_python.c, correct indentation. set_sym_in_dict
>> call parameter "map_pgoff" renamed as "addr_map_pgoff" to
>> match local naming.
>>
>> Changes in V4:
>> - In trace-event-python.c, fixed perf-tools-next merge problem.
>>
>> Changes in V3:
>> - Rebased to linux-perf-tools branch.
>> - Squash symbol-elf.c and symbol.h into same commit.
>> - In map.c, merge dso__is_pie() call into existing if statement.
>> - In arm-cs-trace-disasm.py, remove debug artifacts.
>>
>> Changes in V2:
>> - In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
>> checks per Leo Yan review.
>> - Updated mailing list distribution.
>>
>> Steve Clevenger (4):
>> Add dso__is_pie call to identify ELF PIE
>> Force MAPPING_TYPE__IDENTIY for PIE
>> Add map pgoff to python dictionary based on MAPPING_TYPE
>> Adjust objdump start/end range per map pgoff parameter
>>
>> .../scripts/python/arm-cs-trace-disasm.py | 17 ++++--
>> tools/perf/util/map.c | 4 +-
>> .../scripting-engines/trace-event-python.c | 13 +++-
>> tools/perf/util/symbol-elf.c | 61 +++++++++++++++++++
>> tools/perf/util/symbol.h | 1 +
>> 5 files changed, 86 insertions(+), 10 deletions(-)
>>
>> --
>> 2.44.0
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset
2024-10-08 18:24 ` Steve Clevenger
@ 2024-10-08 21:00 ` Namhyung Kim
2024-10-08 22:12 ` Steve Clevenger
2024-10-09 15:24 ` Leo Yan
0 siblings, 2 replies; 14+ messages in thread
From: Namhyung Kim @ 2024-10-08 21:00 UTC (permalink / raw)
To: Steve Clevenger
Cc: leo.yan, james.clark, mike.leach, suzuki.poulose, ilkka,
coresight, linux-perf-users, linux-arm-kernel
On Tue, Oct 08, 2024 at 11:24:45AM -0700, Steve Clevenger wrote:
>
> Hello Namhyung,
>
> Sorry your question is so late. I don't include the ELF headers here,
> but the problem can be seen with a perf.data packet dump of user
> instruction trace capture. The problem is with the non-zero pgoff. The
> arm-cs-trace-disasm.py script was never passed pgoff information to
> adjust the start/end disassemble range passed to objdump. This patch
> distributes the fix between perf and the arm-cs-trace-disasm.py script.
>
> Here's a brief excerpt from an e-mail I sent to James Clark describing
> the patch before I submitted it.
Oh, is it applied to the coresight tree already? I don't think I got
any notification for that. Please make sure to notify perf maintainers
(and hopefullly to get Ack's) when you touch non-coresight part of the
code.
Also I think we need to clarify how to route coresight tool patches. I
thought they are going through the perf-tools tree. But it doesn't seem
to be the case?
Anyway, sorry about being late on this.
>
> -----------------------------------------------------------------
> .
> .
> This e-mail is to document what I know and don’t know about the problem.
> The background is Fedora 38 introduced non-zero text offsets into common
> shared objects and executables (e.g. libc.so.6, etc.).
>
> If you were to ‘perf report --dump’ the perf.data of a user mode
> instruction trace on Fedora 37 and grep the PERF_RECORD_MMAP2 packets
> you’ll notice all zero values (@ 0) for shared binary and executable
> text offsets. Repeat the same for user trace collected on Fedora 38/39,
> and these text offsets show as non-zero.
>
> Fedora 37:
>
> 4294967295 18446744073709551615 0x8ac8 [0x78]: PERF_RECORD_MMAP2
> 2389577/2389577: [0xffff85124000(0x42000) @ 0 103:04 805306555
> 1941233998]: r-xp /usr/lib/ld-linux-aarch64.so.1
> 4294967295 18446744073709551615 0x8b40 [0x60]: PERF_RECORD_MMAP2
> 2389577/2389577: [0xffff85161000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
> 4294967295 18446744073709551615 0x8ba0 [0x68]: PERF_RECORD_MMAP2
> 2389577/2389577: [0xaaaab09b0000(0x10000) @ 0 103:04 1140851919
> 994218038]: r-xp /usr/bin/dd
> 4294967295 18446744073709551615 0x8c08 [0x70]: PERF_RECORD_MMAP2
> 2389577/2389577: [0xffff84f70000(0x1a9000) @ 0 103:04 1677721733
> 3891973508]: r-xp /usr/lib64/libc.so.6
>
> Fedora 39:
>
> 4294967295 18446744073709551615 0x8ac8 [0x78]: PERF_RECORD_MMAP2
> 18229/18229: [0xffffa5512000(0x1d000) @ 0x10000 103:04 161 4093340249]:
> r-xp /usr/lib/ld-linux-aarch64.so.1
> 4294967295 18446744073709551615 0x8b40 [0x60]: PERF_RECORD_MMAP2
> 18229/18229: [0xffffa554f000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
> 4294967295 18446744073709551615 0x8ba0 [0x68]: PERF_RECORD_MMAP2
> 18229/18229: [0xaaaade7e0000(0xb000) @ 0x10000 103:04 536876423
> 421616320]: r-xp /usr/bin/dd
> 4294967295 18446744073709551615 0x8c08 [0x70]: PERF_RECORD_MMAP2
> 18229/18229: [0xffffa5360000(0x11f000) @ 0x30000 103:04 536873199
> 2415801053]: r-xp /usr/lib64/libc.so.6
>
> The arm-cs-trace-disasm.py script never gets to see the text offset into
> the dso binaries, so has no opportunity to adjust the start/end address
> range passed to objdump. This wasn’t a problem on Fedora 37 and below
> since there’s no start/end adjustment for a zero text offset. On Fedora
> 38/39 distros, the instruction trace shows unconditional branch
> instructions which do not branch to the target address, the clearest
> indication of trouble.
Ok, thanks for sharing this. I'm ok with exposing pgoff to the python
script itself. But I'd like to make sure if it works correctly with
PIEs that have non-zero page offsets in other places (probably ok).
Thanks,
Namhyung
> -----------------------------------------------------------------
>
> On 10/7/2024 10:51 PM, Namhyung Kim wrote:
> > Hello,
> >
> > Sorry for the long delay. But can you please explain your problem in
> > more detail? Showing ELF program (or section) header would be helpful
> > as well.
> >
> > Is the problem when the text mapping has non-zero pgoff only? Is the
> > kernel symbols working correctly? Or is it just the Python script
> > broken?
> >
> > Thanks,
> > Namhyung
> >
> > On Mon, Sep 09, 2024 at 03:30:02PM -0600, Steve Clevenger wrote:
> >> Changes in V8:
> >> - in arm-cs-trace-disasm.py, ensure map_pgoff is not converted to
> >> string.
> >> - Remove map_pgoff integer conversion in dso not found print
> >> message.
> >>
> >> Changes in V7:
> >> - In arm-cs-trace-disasm.py, fix print message core dump resulting
> >> from mixed type arithmetic.
> >> - Modify CS_ETM_TRACE_ON filter to filter zero start_addr. The
> >> CS_ETM_TRACE_ON message is changed to print only in verbose mode.
> >> - Removed verbose mode only notification for start_addr/stop_addr
> >> outside of dso address range.
> >>
> >> Changes in V6:
> >> - In arm-cs-trace-disasm.py, zero map_pgoff for kernel files. Add
> >> map_pgoff to start/end address for dso not found message.
> >> - Added "Reviewed-by" trailer for patches 1-3 previously reviewed
> >> by Leo Yan in V4 and V5.
> >>
> >> Changes in V5:
> >> - In symbol-elf.c, branch to exit_close label if open file.
> >> - In trace_event_python.c, correct indentation. set_sym_in_dict
> >> call parameter "map_pgoff" renamed as "addr_map_pgoff" to
> >> match local naming.
> >>
> >> Changes in V4:
> >> - In trace-event-python.c, fixed perf-tools-next merge problem.
> >>
> >> Changes in V3:
> >> - Rebased to linux-perf-tools branch.
> >> - Squash symbol-elf.c and symbol.h into same commit.
> >> - In map.c, merge dso__is_pie() call into existing if statement.
> >> - In arm-cs-trace-disasm.py, remove debug artifacts.
> >>
> >> Changes in V2:
> >> - In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
> >> checks per Leo Yan review.
> >> - Updated mailing list distribution.
> >>
> >> Steve Clevenger (4):
> >> Add dso__is_pie call to identify ELF PIE
> >> Force MAPPING_TYPE__IDENTIY for PIE
> >> Add map pgoff to python dictionary based on MAPPING_TYPE
> >> Adjust objdump start/end range per map pgoff parameter
> >>
> >> .../scripts/python/arm-cs-trace-disasm.py | 17 ++++--
> >> tools/perf/util/map.c | 4 +-
> >> .../scripting-engines/trace-event-python.c | 13 +++-
> >> tools/perf/util/symbol-elf.c | 61 +++++++++++++++++++
> >> tools/perf/util/symbol.h | 1 +
> >> 5 files changed, 86 insertions(+), 10 deletions(-)
> >>
> >> --
> >> 2.44.0
> >>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset
2024-10-08 21:00 ` Namhyung Kim
@ 2024-10-08 22:12 ` Steve Clevenger
2024-10-09 15:24 ` Leo Yan
1 sibling, 0 replies; 14+ messages in thread
From: Steve Clevenger @ 2024-10-08 22:12 UTC (permalink / raw)
To: Namhyung Kim
Cc: leo.yan, james.clark, mike.leach, suzuki.poulose, ilkka,
coresight, linux-perf-users, linux-arm-kernel
On 10/8/2024 2:00 PM, Namhyung Kim wrote:
> On Tue, Oct 08, 2024 at 11:24:45AM -0700, Steve Clevenger wrote:
>>
>> Hello Namhyung,
>>
>> Sorry your question is so late. I don't include the ELF headers here,
>> but the problem can be seen with a perf.data packet dump of user
>> instruction trace capture. The problem is with the non-zero pgoff. The
>> arm-cs-trace-disasm.py script was never passed pgoff information to
>> adjust the start/end disassemble range passed to objdump. This patch
>> distributes the fix between perf and the arm-cs-trace-disasm.py script.
>>
>> Here's a brief excerpt from an e-mail I sent to James Clark describing
>> the patch before I submitted it.
>
> Oh, is it applied to the coresight tree already? I don't think I got
> any notification for that. Please make sure to notify perf maintainers
> (and hopefullly to get Ack's) when you touch non-coresight part of the
> code.
>
> Also I think we need to clarify how to route coresight tool patches. I
> thought they are going through the perf-tools tree. But it doesn't seem
> to be the case?
>
> Anyway, sorry about being late on this.
FYI. These patch(es) were based on perf-tools-next at revision
6.11.0-rc3. All ack'd by Leo Yan. If they could get merged to
perf-tools, that would be great!
>
>>
>> -----------------------------------------------------------------
>> .
>> .
>> This e-mail is to document what I know and don’t know about the problem.
>> The background is Fedora 38 introduced non-zero text offsets into common
>> shared objects and executables (e.g. libc.so.6, etc.).
>>
>> If you were to ‘perf report --dump’ the perf.data of a user mode
>> instruction trace on Fedora 37 and grep the PERF_RECORD_MMAP2 packets
>> you’ll notice all zero values (@ 0) for shared binary and executable
>> text offsets. Repeat the same for user trace collected on Fedora 38/39,
>> and these text offsets show as non-zero.
>>
>> Fedora 37:
>>
>> 4294967295 18446744073709551615 0x8ac8 [0x78]: PERF_RECORD_MMAP2
>> 2389577/2389577: [0xffff85124000(0x42000) @ 0 103:04 805306555
>> 1941233998]: r-xp /usr/lib/ld-linux-aarch64.so.1
>> 4294967295 18446744073709551615 0x8b40 [0x60]: PERF_RECORD_MMAP2
>> 2389577/2389577: [0xffff85161000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
>> 4294967295 18446744073709551615 0x8ba0 [0x68]: PERF_RECORD_MMAP2
>> 2389577/2389577: [0xaaaab09b0000(0x10000) @ 0 103:04 1140851919
>> 994218038]: r-xp /usr/bin/dd
>> 4294967295 18446744073709551615 0x8c08 [0x70]: PERF_RECORD_MMAP2
>> 2389577/2389577: [0xffff84f70000(0x1a9000) @ 0 103:04 1677721733
>> 3891973508]: r-xp /usr/lib64/libc.so.6
>>
>> Fedora 39:
>>
>> 4294967295 18446744073709551615 0x8ac8 [0x78]: PERF_RECORD_MMAP2
>> 18229/18229: [0xffffa5512000(0x1d000) @ 0x10000 103:04 161 4093340249]:
>> r-xp /usr/lib/ld-linux-aarch64.so.1
>> 4294967295 18446744073709551615 0x8b40 [0x60]: PERF_RECORD_MMAP2
>> 18229/18229: [0xffffa554f000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
>> 4294967295 18446744073709551615 0x8ba0 [0x68]: PERF_RECORD_MMAP2
>> 18229/18229: [0xaaaade7e0000(0xb000) @ 0x10000 103:04 536876423
>> 421616320]: r-xp /usr/bin/dd
>> 4294967295 18446744073709551615 0x8c08 [0x70]: PERF_RECORD_MMAP2
>> 18229/18229: [0xffffa5360000(0x11f000) @ 0x30000 103:04 536873199
>> 2415801053]: r-xp /usr/lib64/libc.so.6
>>
>> The arm-cs-trace-disasm.py script never gets to see the text offset into
>> the dso binaries, so has no opportunity to adjust the start/end address
>> range passed to objdump. This wasn’t a problem on Fedora 37 and below
>> since there’s no start/end adjustment for a zero text offset. On Fedora
>> 38/39 distros, the instruction trace shows unconditional branch
>> instructions which do not branch to the target address, the clearest
>> indication of trouble.
>
> Ok, thanks for sharing this. I'm ok with exposing pgoff to the python
> script itself. But I'd like to make sure if it works correctly with
> PIEs that have non-zero page offsets in other places (probably ok).
>
> Thanks,
> Namhyung
>
>
>> -----------------------------------------------------------------
>>
>> On 10/7/2024 10:51 PM, Namhyung Kim wrote:
>>> Hello,
>>>
>>> Sorry for the long delay. But can you please explain your problem in
>>> more detail? Showing ELF program (or section) header would be helpful
>>> as well.
>>>
>>> Is the problem when the text mapping has non-zero pgoff only? Is the
>>> kernel symbols working correctly? Or is it just the Python script
>>> broken?
>>>
>>> Thanks,
>>> Namhyung
>>>
>>> On Mon, Sep 09, 2024 at 03:30:02PM -0600, Steve Clevenger wrote:
>>>> Changes in V8:
>>>> - in arm-cs-trace-disasm.py, ensure map_pgoff is not converted to
>>>> string.
>>>> - Remove map_pgoff integer conversion in dso not found print
>>>> message.
>>>>
>>>> Changes in V7:
>>>> - In arm-cs-trace-disasm.py, fix print message core dump resulting
>>>> from mixed type arithmetic.
>>>> - Modify CS_ETM_TRACE_ON filter to filter zero start_addr. The
>>>> CS_ETM_TRACE_ON message is changed to print only in verbose mode.
>>>> - Removed verbose mode only notification for start_addr/stop_addr
>>>> outside of dso address range.
>>>>
>>>> Changes in V6:
>>>> - In arm-cs-trace-disasm.py, zero map_pgoff for kernel files. Add
>>>> map_pgoff to start/end address for dso not found message.
>>>> - Added "Reviewed-by" trailer for patches 1-3 previously reviewed
>>>> by Leo Yan in V4 and V5.
>>>>
>>>> Changes in V5:
>>>> - In symbol-elf.c, branch to exit_close label if open file.
>>>> - In trace_event_python.c, correct indentation. set_sym_in_dict
>>>> call parameter "map_pgoff" renamed as "addr_map_pgoff" to
>>>> match local naming.
>>>>
>>>> Changes in V4:
>>>> - In trace-event-python.c, fixed perf-tools-next merge problem.
>>>>
>>>> Changes in V3:
>>>> - Rebased to linux-perf-tools branch.
>>>> - Squash symbol-elf.c and symbol.h into same commit.
>>>> - In map.c, merge dso__is_pie() call into existing if statement.
>>>> - In arm-cs-trace-disasm.py, remove debug artifacts.
>>>>
>>>> Changes in V2:
>>>> - In dso__is_pie() (symbol-elf.c), Decrease indentation, add null pointer
>>>> checks per Leo Yan review.
>>>> - Updated mailing list distribution.
>>>>
>>>> Steve Clevenger (4):
>>>> Add dso__is_pie call to identify ELF PIE
>>>> Force MAPPING_TYPE__IDENTIY for PIE
>>>> Add map pgoff to python dictionary based on MAPPING_TYPE
>>>> Adjust objdump start/end range per map pgoff parameter
>>>>
>>>> .../scripts/python/arm-cs-trace-disasm.py | 17 ++++--
>>>> tools/perf/util/map.c | 4 +-
>>>> .../scripting-engines/trace-event-python.c | 13 +++-
>>>> tools/perf/util/symbol-elf.c | 61 +++++++++++++++++++
>>>> tools/perf/util/symbol.h | 1 +
>>>> 5 files changed, 86 insertions(+), 10 deletions(-)
>>>>
>>>> --
>>>> 2.44.0
>>>>
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset
2024-10-08 21:00 ` Namhyung Kim
2024-10-08 22:12 ` Steve Clevenger
@ 2024-10-09 15:24 ` Leo Yan
2024-10-10 0:20 ` Namhyung Kim
1 sibling, 1 reply; 14+ messages in thread
From: Leo Yan @ 2024-10-09 15:24 UTC (permalink / raw)
To: Namhyung Kim, Steve Clevenger
Cc: james.clark, mike.leach, suzuki.poulose, ilkka, coresight,
linux-perf-users, linux-arm-kernel
Hi Namhyung,
On 10/8/2024 10:00 PM, Namhyung Kim wrote:
[...]
> On Tue, Oct 08, 2024 at 11:24:45AM -0700, Steve Clevenger wrote:
>>
>> Hello Namhyung,
>>
>> Sorry your question is so late. I don't include the ELF headers here,
>> but the problem can be seen with a perf.data packet dump of user
>> instruction trace capture. The problem is with the non-zero pgoff. The
>> arm-cs-trace-disasm.py script was never passed pgoff information to
>> adjust the start/end disassemble range passed to objdump. This patch
>> distributes the fix between perf and the arm-cs-trace-disasm.py script.
>>
>> Here's a brief excerpt from an e-mail I sent to James Clark describing
>> the patch before I submitted it.
>
> Oh, is it applied to the coresight tree already?
No.
> I don't think I got> any notification for that. Please make sure to notify
perf maintainers
> (and hopefullly to get Ack's) when you touch non-coresight part of the
> code.
>
> Also I think we need to clarify how to route coresight tool patches. I
> thought they are going through the perf-tools tree. But it doesn't seem
> to be the case?
AFAIK, in most cases, Arm patches for the perf will be picked up via
perf-tools tree, including this patch series.
[...]
>> Fedora 39:
>>
>> 4294967295 18446744073709551615 0x8ac8 [0x78]: PERF_RECORD_MMAP2
>> 18229/18229: [0xffffa5512000(0x1d000) @ 0x10000 103:04 161 4093340249]:
>> r-xp /usr/lib/ld-linux-aarch64.so.1
>>
>> 4294967295 18446744073709551615 0x8b40 [0x60]: PERF_RECORD_MMAP2
>> 18229/18229: [0xffffa554f000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
>> 4294967295 18446744073709551615 0x8ba0 [0x68]: PERF_RECORD_MMAP2
>> 18229/18229: [0xaaaade7e0000(0xb000) @ 0x10000 103:04 536876423
>> 421616320]: r-xp /usr/bin/dd
>> 4294967295 18446744073709551615 0x8c08 [0x70]: PERF_RECORD_MMAP2
>> 18229/18229: [0xffffa5360000(0x11f000) @ 0x30000 103:04 536873199
>> 2415801053]: r-xp /usr/lib64/libc.so.6
>>
>> The arm-cs-trace-disasm.py script never gets to see the text offset into
>> the dso binaries, so has no opportunity to adjust the start/end address
>> range passed to objdump. This wasn’t a problem on Fedora 37 and below
>> since there’s no start/end adjustment for a zero text offset. On Fedora
>> 38/39 distros, the instruction trace shows unconditional branch
>> instructions which do not branch to the target address, the clearest
>> indication of trouble.
>
> Ok, thanks for sharing this. I'm ok with exposing pgoff to the python
> script itself. But I'd like to make sure if it works correctly with
> PIEs that have non-zero page offsets in other places (probably ok).
Fair point. The 'pgoff' is exposed from the kernel's VMA info, it is a runtime
value after the DSO loading. After set the MAPPING_TYPE__IDENTITY type for PIE
DSO, the 'pgoff' will be ignored in both the perf tool (see map__map_ip()) and
the script.
Sorry that I missed this part when reviewed the change. I understood Steve
have verified the script result. Seems to me, the critical question is kernel
has an offset for PIE DSO but why ignore it in the tool.
@Steve, if you have more info, could you explain a bit what is the reason for
ignoring pgoff for PIE DSO?
Thanks,
Leo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V8 1/4] Add dso__is_pie call to identify ELF PIE
2024-09-09 21:30 ` [PATCH V8 1/4] Add dso__is_pie call to identify ELF PIE Steve Clevenger
@ 2024-10-09 21:13 ` Leo Yan
2024-10-10 16:30 ` Steve Clevenger
0 siblings, 1 reply; 14+ messages in thread
From: Leo Yan @ 2024-10-09 21:13 UTC (permalink / raw)
To: Steve Clevenger, james.clark, mike.leach
Cc: suzuki.poulose, ilkka, coresight, linux-perf-users,
linux-arm-kernel
On 9/9/2024 10:30 PM, Steve Clevenger wrote:
>
> Add dso__is_pie global to read the .dynamic section DT_FLAGS_1 entry for
> the DF_1_PIE flag. This identifies position executable code.
>
> Signed-off-by: Steve Clevenger <scclevenger@os.amperecomputing.com>
> Reviewed-by: Leo Yan <leo.yan@arm.com>
> ---
> tools/perf/util/symbol-elf.c | 61 ++++++++++++++++++++++++++++++++++++
> tools/perf/util/symbol.h | 1 +
> 2 files changed, 62 insertions(+)
>
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index e398abfd13a0..babe47976922 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -662,6 +662,67 @@ static int dso__synthesize_plt_got_symbols(struct dso *dso, Elf *elf,
> return err;
> }
>
> +/*
> + * Check dynamic section DT_FLAGS_1 for a Position Independent
> + * Executable (PIE).
> + */
> +bool dso__is_pie(struct dso *dso)
> +{
> + Elf *elf = NULL;
> + Elf_Scn *scn = NULL;
> + GElf_Ehdr ehdr;
> + GElf_Shdr shdr;
> + bool is_pie = false;
> + char dso_path[PATH_MAX];
> + int fd = -1;
> +
> + if (!dso || (elf_version(EV_CURRENT) == EV_NONE))
> + goto exit; // false
> +
> + dso__build_id_filename(dso, dso_path, sizeof(dso_path), false);
> +
> + fd = open(dso_path, O_RDONLY);
> +
> + if (fd < 0) {
> + pr_debug("%s: cannot read cached %s.\n", __func__, dso_path);
> + goto exit; // false
> + }
> +
> + elf = elf_begin(fd, ELF_C_READ, NULL);
> + gelf_getehdr(elf, &ehdr);
> +
> + if (ehdr.e_type == ET_DYN) {
> + Elf_Data *data;
> + GElf_Dyn *entry;
> + int n_entries = shdr.sh_size / sizeof(GElf_Dyn);
I took time to play this series on Arm64 machine and found issue in above
sentence. As the 'shdr' strucutre is read out from elf_section_by_name()
below, but it calculates the entries before reading out the section header.
Therefore, I observed that program will not be detected as PIE executable due
to 'n_entries' is 0.
With fixing this bug, then I observed the regression caused by patch 02.
Below are steps:
- Build test program with below command:
# gcc -pie -Wl,-z,relro,-z,now -o test test.c
# readelf readelf -a test | grep FLAGS
0x000000000000001e (FLAGS) BIND_NOW
0x000000006ffffffb (FLAGS_1) Flags: NOW PIE
The test program is a simple infinite loop. I run this program in a docker
container with Fedora 40.
- Record trace data:
# perf record -e cycles -p 149531
# perf script
test 149531 483168.078485: 4986 cycles: aaaacde401b4
[unknown] (/home/test)
test 149531 483168.078564: 134207 cycles: aaaacde401b4
[unknown] (/home/test)
test 149531 483168.079305: 1257097 cycles: aaaacde401b4
[unknown] (/home/test)
You can see it fails to parse symbol and print out [unknown].
After I reverted patch 02 in this series, then:
# perf script
test 149531 483168.078485: 4986 cycles: aaaacde401b4
main+0xc (/home/test)
test 149531 483168.078564: 134207 cycles: aaaacde401b4
main+0xc (/home/test)
test 149531 483168.079305: 1257097 cycles: aaaacde401b4
main+0xc (/home/test)
Not sure if I miss anything for PIE executable, seems to me, we should drop
the first two patches and just pass pg_off to python script?
Thanks,
Leo
> +
> + scn = elf_section_by_name(elf, &ehdr, &shdr, ".dynamic", NULL);
> + if (!scn)
> + goto exit_close; // false
> +
> + data = (Elf_Data *) elf_getdata(scn, NULL);
> + if (!data || !data->d_buf)
> + goto exit_close; // false
> +
> + // check DT_FLAGS_1
> + for (int i = 0; i < n_entries; i++) {
> + entry = ((GElf_Dyn *) data->d_buf) + i;
> + if (entry->d_tag == DT_FLAGS_1) {
> + if ((entry->d_un.d_val & DF_1_PIE) != 0) {
> + is_pie = true;
> + break;
> + }
> + }
> + } // end for
> + }
> +
> +exit_close:
> + elf_end(elf);
> + close(fd);
> +exit:
> + return is_pie;
> +}
> +
> /*
> * We need to check if we have a .dynsym, so that we can handle the
> * .plt, synthesizing its symbols, that aren't on the symtabs (be it
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 3fb5d146d9b1..33ea2596ce31 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -127,6 +127,7 @@ void dso__insert_symbol(struct dso *dso,
> struct symbol *sym);
> void dso__delete_symbol(struct dso *dso,
> struct symbol *sym);
> +bool dso__is_pie(struct dso *dso);
>
> struct symbol *dso__find_symbol(struct dso *dso, u64 addr);
> struct symbol *dso__find_symbol_nocache(struct dso *dso, u64 addr);
> --
> 2.44.0
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset
2024-10-09 15:24 ` Leo Yan
@ 2024-10-10 0:20 ` Namhyung Kim
0 siblings, 0 replies; 14+ messages in thread
From: Namhyung Kim @ 2024-10-10 0:20 UTC (permalink / raw)
To: Leo Yan
Cc: Steve Clevenger, james.clark, mike.leach, suzuki.poulose, ilkka,
coresight, linux-perf-users, linux-arm-kernel
On Wed, Oct 09, 2024 at 04:24:29PM +0100, Leo Yan wrote:
> Hi Namhyung,
>
> On 10/8/2024 10:00 PM, Namhyung Kim wrote:
>
> [...]
>
> > On Tue, Oct 08, 2024 at 11:24:45AM -0700, Steve Clevenger wrote:
> >>
> >> Hello Namhyung,
> >>
> >> Sorry your question is so late. I don't include the ELF headers here,
> >> but the problem can be seen with a perf.data packet dump of user
> >> instruction trace capture. The problem is with the non-zero pgoff. The
> >> arm-cs-trace-disasm.py script was never passed pgoff information to
> >> adjust the start/end disassemble range passed to objdump. This patch
> >> distributes the fix between perf and the arm-cs-trace-disasm.py script.
> >>
> >> Here's a brief excerpt from an e-mail I sent to James Clark describing
> >> the patch before I submitted it.
> >
> > Oh, is it applied to the coresight tree already?
>
> No.
>
> > I don't think I got> any notification for that. Please make sure to notify
> perf maintainers
> > (and hopefullly to get Ack's) when you touch non-coresight part of the
> > code.
> >
> > Also I think we need to clarify how to route coresight tool patches. I
> > thought they are going through the perf-tools tree. But it doesn't seem
> > to be the case?
>
> AFAIK, in most cases, Arm patches for the perf will be picked up via
> perf-tools tree, including this patch series.
Ok, I was confused and thought it was applied there.
Thanks,
Namhyung
>
> [...]
> >> Fedora 39:
> >>
> >> 4294967295 18446744073709551615 0x8ac8 [0x78]: PERF_RECORD_MMAP2
> >> 18229/18229: [0xffffa5512000(0x1d000) @ 0x10000 103:04 161 4093340249]:
> >> r-xp /usr/lib/ld-linux-aarch64.so.1
> >>
> >> 4294967295 18446744073709551615 0x8b40 [0x60]: PERF_RECORD_MMAP2
> >> 18229/18229: [0xffffa554f000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
> >> 4294967295 18446744073709551615 0x8ba0 [0x68]: PERF_RECORD_MMAP2
> >> 18229/18229: [0xaaaade7e0000(0xb000) @ 0x10000 103:04 536876423
> >> 421616320]: r-xp /usr/bin/dd
> >> 4294967295 18446744073709551615 0x8c08 [0x70]: PERF_RECORD_MMAP2
> >> 18229/18229: [0xffffa5360000(0x11f000) @ 0x30000 103:04 536873199
> >> 2415801053]: r-xp /usr/lib64/libc.so.6
> >>
> >> The arm-cs-trace-disasm.py script never gets to see the text offset into
> >> the dso binaries, so has no opportunity to adjust the start/end address
> >> range passed to objdump. This wasn’t a problem on Fedora 37 and below
> >> since there’s no start/end adjustment for a zero text offset. On Fedora
> >> 38/39 distros, the instruction trace shows unconditional branch
> >> instructions which do not branch to the target address, the clearest
> >> indication of trouble.
> >
> > Ok, thanks for sharing this. I'm ok with exposing pgoff to the python
> > script itself. But I'd like to make sure if it works correctly with
> > PIEs that have non-zero page offsets in other places (probably ok).
>
> Fair point. The 'pgoff' is exposed from the kernel's VMA info, it is a runtime
> value after the DSO loading. After set the MAPPING_TYPE__IDENTITY type for PIE
> DSO, the 'pgoff' will be ignored in both the perf tool (see map__map_ip()) and
> the script.
>
> Sorry that I missed this part when reviewed the change. I understood Steve
> have verified the script result. Seems to me, the critical question is kernel
> has an offset for PIE DSO but why ignore it in the tool.
>
> @Steve, if you have more info, could you explain a bit what is the reason for
> ignoring pgoff for PIE DSO?
>
> Thanks,
> Leo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V8 1/4] Add dso__is_pie call to identify ELF PIE
2024-10-09 21:13 ` Leo Yan
@ 2024-10-10 16:30 ` Steve Clevenger
0 siblings, 0 replies; 14+ messages in thread
From: Steve Clevenger @ 2024-10-10 16:30 UTC (permalink / raw)
To: Leo Yan, james.clark, mike.leach
Cc: suzuki.poulose, ilkka, coresight, linux-perf-users,
linux-arm-kernel
On 10/9/2024 2:13 PM, Leo Yan wrote:
> On 9/9/2024 10:30 PM, Steve Clevenger wrote:
>>
>> Add dso__is_pie global to read the .dynamic section DT_FLAGS_1 entry for
>> the DF_1_PIE flag. This identifies position executable code.
>>
>> Signed-off-by: Steve Clevenger <scclevenger@os.amperecomputing.com>
>> Reviewed-by: Leo Yan <leo.yan@arm.com>
>> ---
>> tools/perf/util/symbol-elf.c | 61 ++++++++++++++++++++++++++++++++++++
>> tools/perf/util/symbol.h | 1 +
>> 2 files changed, 62 insertions(+)
>>
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index e398abfd13a0..babe47976922 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -662,6 +662,67 @@ static int dso__synthesize_plt_got_symbols(struct dso *dso, Elf *elf,
>> return err;
>> }
>>
>> +/*
>> + * Check dynamic section DT_FLAGS_1 for a Position Independent
>> + * Executable (PIE).
>> + */
>> +bool dso__is_pie(struct dso *dso)
>> +{
>> + Elf *elf = NULL;
>> + Elf_Scn *scn = NULL;
>> + GElf_Ehdr ehdr;
>> + GElf_Shdr shdr;
>> + bool is_pie = false;
>> + char dso_path[PATH_MAX];
>> + int fd = -1;
>> +
>> + if (!dso || (elf_version(EV_CURRENT) == EV_NONE))
>> + goto exit; // false
>> +
>> + dso__build_id_filename(dso, dso_path, sizeof(dso_path), false);
>> +
>> + fd = open(dso_path, O_RDONLY);
>> +
>> + if (fd < 0) {
>> + pr_debug("%s: cannot read cached %s.\n", __func__, dso_path);
>> + goto exit; // false
>> + }
>> +
>> + elf = elf_begin(fd, ELF_C_READ, NULL);
>> + gelf_getehdr(elf, &ehdr);
>> +
>> + if (ehdr.e_type == ET_DYN) {
>> + Elf_Data *data;
>> + GElf_Dyn *entry;
>> + int n_entries = shdr.sh_size / sizeof(GElf_Dyn);
>
> I took time to play this series on Arm64 machine and found issue in above
> sentence. As the 'shdr' strucutre is read out from elf_section_by_name()
> below, but it calculates the entries before reading out the section header.
> Therefore, I observed that program will not be detected as PIE executable due
> to 'n_entries' is 0.
>
> With fixing this bug, then I observed the regression caused by patch 02.
>
> Below are steps:
>
> - Build test program with below command:
>
> # gcc -pie -Wl,-z,relro,-z,now -o test test.c
> # readelf readelf -a test | grep FLAGS
> 0x000000000000001e (FLAGS) BIND_NOW
> 0x000000006ffffffb (FLAGS_1) Flags: NOW PIE
>
> The test program is a simple infinite loop. I run this program in a docker
> container with Fedora 40.
>
> - Record trace data:
>
> # perf record -e cycles -p 149531
> # perf script
>
> test 149531 483168.078485: 4986 cycles: aaaacde401b4
> [unknown] (/home/test)
> test 149531 483168.078564: 134207 cycles: aaaacde401b4
> [unknown] (/home/test)
> test 149531 483168.079305: 1257097 cycles: aaaacde401b4
> [unknown] (/home/test)
>
> You can see it fails to parse symbol and print out [unknown].
>
> After I reverted patch 02 in this series, then:
>
> # perf script
> test 149531 483168.078485: 4986 cycles: aaaacde401b4
> main+0xc (/home/test)
> test 149531 483168.078564: 134207 cycles: aaaacde401b4
> main+0xc (/home/test)
> test 149531 483168.079305: 1257097 cycles: aaaacde401b4
> main+0xc (/home/test)
>
> Not sure if I miss anything for PIE executable, seems to me, we should drop
> the first two patches and just pass pg_off to python script?
> > Thanks,
> Leo
Hi Leo,
I verified the perf/arm-cs-trace-disasm.py script patch results for
CoreSight user and kernel trace using Fedora distros 37 and higher. I
did little to test non-CoreSight trace, but I do see the dso__is_pie
initialization error, and the problem revealed by your test program. As
a side effect, the problem would contribute to the CoreSight user
instruction test passes.
The arm-cs-trace-disasm.py python script has a legacy IF block to force
'dso_vm_start' to zero for kernel code. Similarly, map_pgoff gets forced
to zero here. See the adjacent comment pertaining to memory offsets for
kernel dso and executable file dso. This, I think, is the root of the
misunderstanding with regard to the PIE check. I see the python IF
comparison includes a check for dso_start == 0x400000 but I see no
information as to why this hardcoded value is relevant. Do you know?
perf has (or now has) the information (e.g. map__rip_2objdump,
map__objdump_2mem) needed to make all objdump address adjustments prior
to calling the script. The script could even be made redundant as
suggested in an earlier thread. In the meantime, Ampere needs this
working so the patch implementation has to work with the current
perf/script implementation.
I agree the revised approach is to pass pgoff to the updated script
unmodified.This means patch 3 will pass map_pgoff unmodified by
MAPPING_TYPE into the dictionary. Patch 1-2 goes away, and patch 4
remains as-is.
Steve
>
>> +
>> + scn = elf_section_by_name(elf, &ehdr, &shdr, ".dynamic", NULL);
>> + if (!scn)
>> + goto exit_close; // false
>> +
>> + data = (Elf_Data *) elf_getdata(scn, NULL);
>> + if (!data || !data->d_buf)
>> + goto exit_close; // false
>> +
>> + // check DT_FLAGS_1
>> + for (int i = 0; i < n_entries; i++) {
>> + entry = ((GElf_Dyn *) data->d_buf) + i;
>> + if (entry->d_tag == DT_FLAGS_1) {
>> + if ((entry->d_un.d_val & DF_1_PIE) != 0) {
>> + is_pie = true;
>> + break;
>> + }
>> + }
>> + } // end for
>> + }
>> +
>> +exit_close:
>> + elf_end(elf);
>> + close(fd);
>> +exit:
>> + return is_pie;
>> +}
>> +
>> /*
>> * We need to check if we have a .dynsym, so that we can handle the
>> * .plt, synthesizing its symbols, that aren't on the symtabs (be it
>> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
>> index 3fb5d146d9b1..33ea2596ce31 100644
>> --- a/tools/perf/util/symbol.h
>> +++ b/tools/perf/util/symbol.h
>> @@ -127,6 +127,7 @@ void dso__insert_symbol(struct dso *dso,
>> struct symbol *sym);
>> void dso__delete_symbol(struct dso *dso,
>> struct symbol *sym);
>> +bool dso__is_pie(struct dso *dso);
>>
>> struct symbol *dso__find_symbol(struct dso *dso, u64 addr);
>> struct symbol *dso__find_symbol_nocache(struct dso *dso, u64 addr);
>> --
>> 2.44.0
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-10-10 16:30 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-09 21:30 [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Steve Clevenger
2024-09-09 21:29 ` [PATCH V8 4/4] Adjust objdump start/end range per map pgoff parameter Steve Clevenger
2024-09-10 8:03 ` Leo Yan
2024-09-09 21:29 ` [PATCH V8 3/4] Add map pgoff to python dictionary based on MAPPING_TYPE Steve Clevenger
2024-09-09 21:30 ` [PATCH V8 2/4] Force MAPPING_TYPE__IDENTIY for PIE Steve Clevenger
2024-09-09 21:30 ` [PATCH V8 1/4] Add dso__is_pie call to identify ELF PIE Steve Clevenger
2024-10-09 21:13 ` Leo Yan
2024-10-10 16:30 ` Steve Clevenger
2024-10-08 5:51 ` [PATCH V8 0/4] arm-cs-trace-disasm.py/perf must accommodate non-zero DSO text offset Namhyung Kim
2024-10-08 18:24 ` Steve Clevenger
2024-10-08 21:00 ` Namhyung Kim
2024-10-08 22:12 ` Steve Clevenger
2024-10-09 15:24 ` Leo Yan
2024-10-10 0:20 ` Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).