From: Fangrui Song <maskray@google.com>
To: Ian Rogers <irogers@google.com>
Cc: Leo Yan <leo.yan@linaro.org>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Namhyung Kim <namhyung@kernel.org>,
Ingo Molnar <mingo@redhat.com>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
Chang Rui <changruinj@gmail.com>
Subject: Re: [RFC PATCH v1] perf symbol: Correct address for bss symbols
Date: Mon, 11 Jul 2022 10:27:06 -0700 [thread overview]
Message-ID: <20220711172706.rtfd6pp2pochmdre@google.com> (raw)
In-Reply-To: <CAP-5=fX_fT7e9tqDBKXh_1CQ8w80iOXCGz2kJT_nHpY6wYWqmQ@mail.gmail.com>
On 2022-07-11, Ian Rogers wrote:
>On Sat, Jul 9, 2022 at 6:22 PM Leo Yan <leo.yan@linaro.org> wrote:
>>
>> When using 'perf mem' and 'perf c2c', an issue is observed that tool
>> reports the wrong offset for global data symbols. This is a common
>> issue on both x86 and Arm64 platforms.
>>
>> Let's see an example, for a test program, below is the disassembly for
>> its .bss section which is dumped with objdump:
>>
>> ...
>>
>> Disassembly of section .data:
>>
>> 0000000000004000 <__data_start>:
>> ...
>>
>> 0000000000004008 <__dso_handle>:
>> 4008: 08 40 00 or %al,0x0(%rax)
>> 400b: 00 00 add %al,(%rax)
>> 400d: 00 00 add %al,(%rax)
>> ...
>>
>> 0000000000004010 <wait_to_begin>:
>> 4010: 01 00 add %eax,(%rax)
>> 4012: 00 00 add %al,(%rax)
>> 4014: 00 00 add %al,(%rax)
>> ...
>>
>> 0000000000004018 <lock_thd_name>:
>> 4018: 08 20 or %ah,(%rax)
>> 401a: 00 00 add %al,(%rax)
>> 401c: 00 00 add %al,(%rax)
>> ...
>>
>> 0000000000004020 <reader_thd_name>:
>> 4020: 10 20 adc %ah,(%rax)
>> 4022: 00 00 add %al,(%rax)
>> 4024: 00 00 add %al,(%rax)
>> ...
>>
>> Disassembly of section .bss:
>>
>> 0000000000004040 <completed.0>:
>> ...
>>
>> 0000000000004080 <buf1>:
>> ...
>>
>> 00000000000040c0 <buf2>:
>> ...
>>
>> 0000000000004100 <thread>:
>> ...
>>
>> First we used 'perf mem record' to run the test program and then used
>> 'perf --debug verbose=4 mem report' to observe what's the symbol info
>> for 'buf1' and 'buf2' structures.
>>
>> # ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
>> # ./perf --debug verbose=4 mem report
>> ...
>> dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
>> symbol__new: buf2 0x30a8-0x30e8
>> ...
>> dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
>> symbol__new: buf1 0x3068-0x30a8
>> ...
It seems unclear how 0x30a8 and 0x3068 are derived,
>> Perf tool relies on libelf to parse symbols, here 'st_value' is the
>> address from executable file, 'sh_addr' is the belonged section's linked
>> start address, and 'sh_offset' is the dynamic loaded address for this
>> section, then perf tool uses below formula to adjust symbol address:
>>
>> adjusted_address = st_value - sh_addr + sh_offset
>>
>> So we can see the final adjusted address ranges for buf1 and buf2 are
>> [0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
>> incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
>> compiler attribute with 64-byte alignment.
so I cannot judge this paragraph.
>> The problem happens for 'sh_offset', libelf returns it as 0x3028 which
>> is not 64-byte aligned, on the other hand, we can see both 'st_value'
>> and 'sh_addr' are 64-byte aligned. Combining with disassembly, it's
>> likely libelf uses the .data section end address as .bss section
>> start address, therefore, it doesn't respect the alignment attribute for
>> structures in .bss section.
>>
>> Since .data and .bss sections are in the continuous virtual address
>> space, and .data section info returned by libelf is reliable, to fix
>> this issue, if detects it's a bss symbol, it rolls back to use .data
>> section info to adjust symbol's virtual address.
This is not necessarily true.
* In GNU ld's internal linker script, .data1 sits between .data and .bss.
* A linker script can add other sections between .data and .bss
* A linker script may place .data and .bss in two PT_LOAD program headers.
% readelf -WS aa
There are 13 section headers, starting at offset 0x10a8:
With a linker script like
% cat a/a.lds
SECTIONS {
.text : { *(.text) }
data1 : { *(data1) }
data2 : { *(data2) }
.bss : { *(.bss) }
}
I can get something like
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 0000000000000000 001000 000001 00 AX 0 0 1
[ 2] data1 PROGBITS 0000000000000001 001001 000001 00 WA 0 0 1
[ 3] data2 PROGBITS 0000000000000002 001002 000001 00 WA 0 0 1
[ 4] .data PROGBITS 0000000000000003 001003 000000 00 WA 0 0 1
[ 5] data3 PROGBITS 0000000000000003 001003 000001 00 WA 0 0 1
[ 6] .bss NOBITS 0000000000000020 001004 000001 00 WA 0 0 32
.bss's sh_offset does not need to be aligned per http://www.sco.com/developers/gabi/latest/ch4.sheader.html
sh_offset
This member's value gives the byte offset from the beginning of the file to the first byte in the section. One section type, SHT_NOBITS described below, occupies no space in the file, and its sh_offset member locates the conceptual placement in the file.
I don't have more context why the file offset is needed for a variable in the all-zero section.
If the file offset has to be used and we want to use a heuristic, a better one is to find the section index of .bss, say, i.
const uint64_t align = shdr[i].sh_addralign;
assert(i > 0);
if (shdr[i].offset % align == 0)
return shdr[i].offset;
return (shdr[i-1].sh_offset + shdr[i-1].sh_size + align - 1) & -align;
Really, it is better to use the program header to derive the virtual address of a variable residing in .bss.
>> Essentially, we need to fix libelf to return correct offsets for
>> sections, on the other hand, we live commonly with existed versions of
>> libelf. So we also need this change in perf tool.
>>
>> Fixes: f17e04afaff8 ("perf report: Fix ELF symbol parsing")
>> Reported-by: Chang Rui <changruinj@gmail.com>
>> Signed-off-by: Leo Yan <leo.yan@linaro.org>
>
>This looks good to me, I'm happy to add my:
>Acked-by: Ian Rogers <irogers@google.com>
>I've added Fangrui Song who is more knowledge-able on ELF, libelf,
>etc. than me and may have additional thoughts.
>
>Thanks,
>Ian
>
>> ---
>> tools/perf/util/dso.h | 1 +
>> tools/perf/util/symbol-elf.c | 26 ++++++++++++++++++++++++++
>> 2 files changed, 27 insertions(+)
>>
>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
>> index 3a9fd4d389b5..00f57f4ac6bc 100644
>> --- a/tools/perf/util/dso.h
>> +++ b/tools/perf/util/dso.h
>> @@ -180,6 +180,7 @@ struct dso {
>> u8 rel;
>> struct build_id bid;
>> u64 text_offset;
>> + int data_sec_index;
>> const char *short_name;
>> const char *long_name;
>> u16 long_name_len;
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index ecd377938eea..ed65dd26d58e 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -1095,6 +1095,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>> Elf *elf;
>> int nr = 0;
>> bool remap_kernel = false, adjust_kernel_syms = false;
>> + size_t sec_index;
>>
>> if (kmap && !kmaps)
>> return -1;
>> @@ -1113,6 +1114,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>> ".text", NULL))
>> dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
>>
>> + if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr,
>> + ".data", &sec_index))
>> + dso->data_sec_index = sec_index;
>> +
>> if (runtime_ss->opdsec)
>> opddata = elf_rawdata(runtime_ss->opdsec, NULL);
>>
>> @@ -1227,6 +1232,27 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>>
>> gelf_getshdr(sec, &shdr);
>>
>> + /*
>> + * When the first data structure in .bss section is attributed
>> + * with alignment (e.g. 64-byte aligned), libelf doesn't reflect
>> + * the alignment in the 'shdr.sh_offset' field, at the end the
>> + * field is filled with the end loading address of a prior
>> + * section rather than the aligned address of .bss section.
>> + * This leads to mess for later parsing .bss symbols.
>> + *
>> + * Since .data and .bss sections are in the continuous virtual
>> + * address space, and .data section's info is reliable. So if
>> + * detects it's a bss symbol, we retrieve .data section info
>> + * for adjusting address.
>> + */
>> + if (!strcmp(elf_sec__name(&shdr, secstrs_sym), ".bss")) {
>> + sec = elf_getscn(syms_ss->elf, dso->data_sec_index);
>> + if (!sec)
>> + goto out_elf_end;
>> +
>> + gelf_getshdr(sec, &shdr);
>> + }
>> +
>> secstrs = secstrs_sym;
>>
>> /*
>> --
>> 2.25.1
>>
next prev parent reply other threads:[~2022-07-11 17:27 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-10 1:22 [RFC PATCH v1] perf symbol: Correct address for bss symbols Leo Yan
2022-07-11 16:09 ` Ian Rogers
2022-07-11 17:27 ` Fangrui Song [this message]
2022-07-12 4:05 ` Leo Yan
2022-07-13 3:29 ` Fangrui Song
2022-07-24 2:38 ` Leo Yan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220711172706.rtfd6pp2pochmdre@google.com \
--to=maskray@google.com \
--cc=acme@kernel.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=changruinj@gmail.com \
--cc=irogers@google.com \
--cc=jolsa@kernel.org \
--cc=leo.yan@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).