* [PATCH 1/3] x86: record relocation offset
2009-12-30 3:15 [PATCH 0/3] perf_event: fix getting symbol error if kernel is relocatable Xiao Guangrong
@ 2009-12-30 3:16 ` Xiao Guangrong
2009-12-30 13:15 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 17+ messages in thread
From: Xiao Guangrong @ 2009-12-30 3:16 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Gleixner, H. Peter Anvin, Peter Zijlstra,
Frederic Weisbecker, Paul Mackerras, LKML
Record the relocation offset; perf tools will use it
to adjust kernel symbol addresses.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
arch/x86/boot/compressed/head_32.S | 2 ++
arch/x86/boot/compressed/head_64.S | 3 +++
arch/x86/include/asm/bootparam.h | 3 ++-
arch/x86/kernel/asm-offsets_32.c | 1 +
arch/x86/kernel/asm-offsets_64.c | 1 +
arch/x86/kernel/cpu/perf_event.c | 4 ++++
6 files changed, 13 insertions(+), 1 deletions(-)
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index f543b70..dc9748a 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -151,6 +151,8 @@ relocated:
movl %ebp, %ebx
subl $LOAD_PHYSICAL_ADDR, %ebx
jz 2f /* Nothing to be done if loaded at compiled addr. */
+
+ movl %ebx, BP_relocate_offset(%esi)
/*
* Process relocations.
*/
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index faff0dc..8170f32 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -90,6 +90,9 @@ ENTRY(startup_32)
addl %eax, %ebx
notl %eax
andl %eax, %ebx
+ movl %ebx, %eax
+ subl $LOAD_PHYSICAL_ADDR, %eax
+ movl %eax, BP_relocate_offset(%esi)
#else
movl $LOAD_PHYSICAL_ADDR, %ebx
#endif
diff --git a/arch/x86/include/asm/bootparam.h b/arch/x86/include/asm/bootparam.h
index 6be33d8..80b8d1f 100644
--- a/arch/x86/include/asm/bootparam.h
+++ b/arch/x86/include/asm/bootparam.h
@@ -88,7 +88,8 @@ struct boot_params {
__u8 _pad2[4]; /* 0x054 */
__u64 tboot_addr; /* 0x058 */
struct ist_info ist_info; /* 0x060 */
- __u8 _pad3[16]; /* 0x070 */
+ __s32 relocate_offset; /* 0x070 */
+ __u8 _pad3[12]; /* 0x074 */
__u8 hd0_info[16]; /* obsolete! */ /* 0x080 */
__u8 hd1_info[16]; /* obsolete! */ /* 0x090 */
struct sys_desc_table sys_desc_table; /* 0x0a0 */
diff --git a/arch/x86/kernel/asm-offsets_32.c b/arch/x86/kernel/asm-offsets_32.c
index dfdbf64..8028e3b 100644
--- a/arch/x86/kernel/asm-offsets_32.c
+++ b/arch/x86/kernel/asm-offsets_32.c
@@ -148,4 +148,5 @@ void foo(void)
OFFSET(BP_hardware_subarch, boot_params, hdr.hardware_subarch);
OFFSET(BP_version, boot_params, hdr.version);
OFFSET(BP_kernel_alignment, boot_params, hdr.kernel_alignment);
+ OFFSET(BP_relocate_offset, boot_params, relocate_offset);
}
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index 4a6aeed..fbfa5f3 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -127,6 +127,7 @@ int main(void)
OFFSET(BP_hardware_subarch, boot_params, hdr.hardware_subarch);
OFFSET(BP_version, boot_params, hdr.version);
OFFSET(BP_kernel_alignment, boot_params, hdr.kernel_alignment);
+ OFFSET(BP_relocate_offset, boot_params, relocate_offset);
BLANK();
DEFINE(PAGE_SIZE_asm, PAGE_SIZE);
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index c223b7e..11d2a7d 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -26,8 +26,10 @@
#include <asm/apic.h>
#include <asm/stacktrace.h>
#include <asm/nmi.h>
+#include <asm/setup.h>
static u64 perf_event_mask __read_mostly;
+static s32 relocate_offset;
/* The maximal number of PEBS events: */
#define MAX_PEBS_EVENTS 4
@@ -2173,6 +2175,8 @@ void __init init_hw_perf_events(void)
pr_info("Performance Events: ");
+ relocate_offset = boot_params.relocate_offset;
+
switch (boot_cpu_data.x86_vendor) {
case X86_VENDOR_INTEL:
err = intel_pmu_init();
--
1.6.1.2
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 3:16 ` [PATCH 1/3] x86: record relocation offset Xiao Guangrong
@ 2009-12-30 13:15 ` Arnaldo Carvalho de Melo
2009-12-30 19:45 ` H. Peter Anvin
0 siblings, 1 reply; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-12-30 13:15 UTC (permalink / raw)
To: Xiao Guangrong
Cc: Ingo Molnar, Thomas Gleixner, H. Peter Anvin, Peter Zijlstra,
Frederic Weisbecker, Paul Mackerras, Frank Ch. Eigler, fche LKML
Em Wed, Dec 30, 2009 at 11:16:57AM +0800, Xiao Guangrong escreveu:
> Record relocation offset, perf tools will use it
> to adjust kernel symbol address
> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> arch/x86/boot/compressed/head_32.S | 2 ++
> arch/x86/boot/compressed/head_64.S | 3 +++
> arch/x86/include/asm/bootparam.h | 3 ++-
> arch/x86/kernel/asm-offsets_32.c | 1 +
> arch/x86/kernel/asm-offsets_64.c | 1 +
> arch/x86/kernel/cpu/perf_event.c | 4 ++++
<SNIP>
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -26,8 +26,10 @@
> #include <asm/nmi.h>
> +#include <asm/setup.h>
>
> static u64 perf_event_mask __read_mostly;
> +static s32 relocate_offset;
>
> @@ -2173,6 +2175,8 @@ void __init init_hw_perf_events(void)
> pr_info("Performance Events: ");
>
> + relocate_offset = boot_params.relocate_offset;
> switch (boot_cpu_data.x86_vendor) {
> case X86_VENDOR_INTEL:
I'm no expert on the intricacies of boot_params, and all the other hunks
seem sensible, but can't we provide a non-perf-specific way of getting
the relocate_offset? I guess other tools would also love to have it.
What about systemtap, don't they solve this in some other way? Frank?
- Arnaldo
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 13:15 ` Arnaldo Carvalho de Melo
@ 2009-12-30 19:45 ` H. Peter Anvin
2009-12-30 20:39 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 17+ messages in thread
From: H. Peter Anvin @ 2009-12-30 19:45 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Xiao Guangrong, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
Frederic Weisbecker, Paul Mackerras, Frank Ch. Eigler, fche LKML
On 12/30/2009 05:15 AM, Arnaldo Carvalho de Melo wrote:
>
> I'm no expert on the intricacies of boot_params, but all the other hunks
> seems sensible, but can't we provide a non-perf specific way of getting
> the relocate_offset? I guess other tools would also love to have it.
>
> What about systemtap, don't they solve this in some other way? Frank?
>
I at one point proposed that boot_params should be exported in toto via
sysfs. This got rather brutally shut down as "it's just a debugging
feature" and got moved to debugfs (/debug/boot_params/data). However,
the entire boot_params structure is available there.
Regardless of the reporting method, the patch passing this in by
modifying the early assembly code, though, is more than a little
pointless. The kernel already knows where it is loaded -- obviously, by
sheer necessity -- and knows how it was itself configured, and as such
we can do this calculation in C code without modifying boot_params or
the early bootstrap.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 19:45 ` H. Peter Anvin
@ 2009-12-30 20:39 ` Arnaldo Carvalho de Melo
2009-12-30 21:58 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-12-30 20:39 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Xiao Guangrong, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
Frederic Weisbecker, Paul Mackerras, Frank Ch. Eigler, fche LKML
Em Wed, Dec 30, 2009 at 11:45:30AM -0800, H. Peter Anvin escreveu:
> On 12/30/2009 05:15 AM, Arnaldo Carvalho de Melo wrote:
> > I'm no expert on the intricacies of boot_params, but all the other hunks
> > seems sensible, but can't we provide a non-perf specific way of getting
> > the relocate_offset? I guess other tools would also love to have it.
> > What about systemtap, don't they solve this in some other way? Frank?
>
> I at one point proposed that boot_params should be exported in toto via
> sysfs. This got rather brutally shut down as "it's just a debugging
> feature" and got moved to debugfs (/debug/boot_params/data). However,
> the entire boot_params structure is available there.
>
> Regardless of the reporting method, the patch passing this in by
> modifying the early assembly code, though, is more than a little
> pointless. The kernel already knows where it is loaded -- obviously, by
> sheer necessity -- and knows how it was itself configured, and as such
> we can do this calculation in C code without modifying boot_params or
> the early bootstrap.
Yeah, rereading the start of this discussion it now seems to me that
what is happening is that a valid vmlinux is found, i.e. one with the
same buildid as the buildid found in the perf.data file but then the
kernel, at the time of perf record, was relocated, not matching what is
in the vmlinux file.
So what we need to do is to figure this out at 'perf record' time and
encode this in the header, so that later, at 'perf report' time, we can
use a matching vmlinux and do the relocation (store this relocation
offset in struct map->start for the kernel map) to get the right
results.
Problem is that at 'perf record' time we may not have access to the
vmlinux file, and thus not be able to figure out the relocation applied
in that boot.
Then, at a later time, and possibly on another machine, on another arch,
we try to map back IPs to symbols, the /proc/kallsyms is completely
unrelated and we now have a vmlinux unrelocated...
So we need a way to get the relocation applied at 'perf record' time and
encode it in the perf.data header. Ideas about how to do that?
- Arnaldo
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 20:39 ` Arnaldo Carvalho de Melo
@ 2009-12-30 21:58 ` Arnaldo Carvalho de Melo
2009-12-30 22:22 ` James Bottomley
0 siblings, 1 reply; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-12-30 21:58 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Xiao Guangrong, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
Frederic Weisbecker, Paul Mackerras, Frank Ch. Eigler,
linux-kernel, James Bottomley
Em Wed, Dec 30, 2009 at 06:39:36PM -0200, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Dec 30, 2009 at 11:45:30AM -0800, H. Peter Anvin escreveu:
> > The kernel already knows where it is loaded -- obviously, by sheer
> > necessity -- and knows how it was itself configured, and as such we
> > can do this calculation in C code without modifying boot_params or
> > the early bootstrap.
>
> Problem is that at 'perf record' time we may not have access to the
> vmlinux file, and thus not be able to figure out the relocation applied
> in that boot.
>
> Then, at a later time, and possibly on another machine, on another arch,
> we try to map back IPs to symbols, the /proc/kallsyms is completely
> unrelated and we now have a vmlinux unrelocated...
>
> So we need a way to get the relocation applied at 'perf record' time and
> encode it in the perf.data header. Ideas about how to do that?
Well, I guess we could do the _stext trick again, storing its value,
taken from /proc/kallsyms, into the perf.data header, then figuring out
the relocation by looking at its value in the vmlinux symtab.
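A minimal sketch of the lookup half of this idea, assuming kallsyms-style input of the form "<hex address> <type> <name>". The helper name kallsyms_find is hypothetical and is not part of perf; it only shows how the runtime address of a reference symbol could be read before being stored in the perf.data header.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan kallsyms-style text ("<hex addr> <type> <name>")
 * for the symbol `name` and store its address in *addr.
 * Returns 0 on success, -1 if the symbol is not present.
 */
int kallsyms_find(FILE *fp, const char *name, unsigned long long *addr)
{
    char line[512], sym[256], type;
    unsigned long long a;

    while (fgets(line, sizeof(line), fp)) {
        /* Skip lines that do not carry all three leading fields. */
        if (sscanf(line, "%llx %c %255s", &a, &type, sym) != 3)
            continue;
        if (strcmp(sym, name) == 0) {
            *addr = a;
            return 0;
        }
    }
    return -1;
}
```

On a live system the FILE pointer would come from fopen("/proc/kallsyms", "r"); note that the kernel may report zeroed addresses to unprivileged readers, so a zero result should be treated as "unknown".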
There were concerns in the past about relying on _stext, IIRC, James?
- Arnaldo
* Re: [PATCH 1/3] x86: record relocation offset
@ 2009-12-30 22:09 H. Peter Anvin
0 siblings, 0 replies; 17+ messages in thread
From: H. Peter Anvin @ 2009-12-30 22:09 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Xiao Guangrong, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
Frederic Weisbecker, Paul Mackerras, Frank Ch. Eigler,
linux-kernel, James Bottomley
Are we concerned about virtual or physical addresses, here? I'm assuming virtual; in that case do note that we only actually relocate the kernel on 32 bits - on 64 bits the relocation is done at the page table level since we need the high map anyway.
On 32 bits one can compare any one symbol before and after relocation - it obviously doesn't matter which symbol as long as it is the same. The kernel start will be given by _text or startup_32; if that feels too "fuzzy" we could of course add a specific kernel start symbol explicitly for that purpose.
"Arnaldo Carvalho de Melo" <acme@infradead.org> wrote:
>Em Wed, Dec 30, 2009 at 06:39:36PM -0200, Arnaldo Carvalho de Melo escreveu:
>> Em Wed, Dec 30, 2009 at 11:45:30AM -0800, H. Peter Anvin escreveu:
>> > The kernel already knows where it is loaded -- obviously, by sheer
>> > necessity -- and knows how it was itself configured, and as such we
>> > can do this calculation in C code without modifying boot_params or
>> > the early bootstrap.
>>
>> Problem is that at 'perf record' time we may not have access to the
>> vmlinux file, and thus not be able to figure out the relocation applied
>> in that boot.
>>
>> Then, at a later time, and possibly on another machine, on another arch,
>> we try to map back IPs to symbols, the /proc/kallsyms is completely
>> unrelated and we now have a vmlinux unrelocated...
>>
>> So we need a way to get the relocation applied at 'perf record' time and
>> encode it in the perf.data header. Ideas about how to do that?
>
>Well, I guess we could do the _stext trick again, storing its value,
>taken from /proc/kallsyms, into the perf.data header, then figuring out
>the relocation by looking at its value in the vmlinux symtab.
>
>There were concerns in the past about relying on _stext, IIRC, James?
>
>- Arnaldo
--
Sent from my mobile phone. Please excuse any lack of formatting.
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 21:58 ` Arnaldo Carvalho de Melo
@ 2009-12-30 22:22 ` James Bottomley
0 siblings, 0 replies; 17+ messages in thread
From: James Bottomley @ 2009-12-30 22:22 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: H. Peter Anvin, Xiao Guangrong, Ingo Molnar, Thomas Gleixner,
Peter Zijlstra, Frederic Weisbecker, Paul Mackerras,
Frank Ch. Eigler, linux-kernel
On Wed, 2009-12-30 at 19:58 -0200, Arnaldo Carvalho de Melo wrote:
> Em Wed, Dec 30, 2009 at 06:39:36PM -0200, Arnaldo Carvalho de Melo escreveu:
> > Em Wed, Dec 30, 2009 at 11:45:30AM -0800, H. Peter Anvin escreveu:
> > > The kernel already knows where it is loaded -- obviously, by sheer
> > > necessity -- and knows how it was itself configured, and as such we
> > > can do this calculation in C code without modifying boot_params or
> > > the early bootstrap.
> >
> > Problem is that at 'perf record' time we may not have access to the
> > vmlinux file, and thus not be able to figure out the relocation applied
> > in that boot.
> >
> > Then, at a later time, and possibly on another machine, on another arch,
> > we try to map back IPs to symbols, the /proc/kallsyms is completely
> > unrelated and we now have a vmlinux unrelocated...
> >
> > So we need a way to get the relocation applied at 'perf record' time and
> > encode it in the perf.data header. Ideas about how to do that?
>
> Well, I guess we could do the _stext trick again, storing its value,
> taken from /proc/kallsyms, into the perf.data header, then figuring out
> the relocation by looking at its value in the vmlinux symtab.
So reading the thread, I think the problem only exists for x86 compiled
as a relocatable kernel.
> There were concerns in the past about relying on _stext, IIRC, James?
Well, the original concerns were that _text relative relocation
resolution only works for the core kernel, not for modules.
Additionally, the kernel is in several sections, most notably init and
runtime ... these may get loaded at different locations so _text
relative symbol resolution won't work in init sections.
Right at the moment, only x86 and ppc do a relocatable kernel, and, as I
understand the process, the whole kernel image gets a relative offset
applied, so all sections get the same offset. Thus, for this case it
looks like computing the offset from any known symbol would work
(including _text).
James
* Re: [PATCH 1/3] x86: record relocation offset
@ 2009-12-30 23:26 H. Peter Anvin
2009-12-30 23:41 ` James Bottomley
0 siblings, 1 reply; 17+ messages in thread
From: H. Peter Anvin @ 2009-12-30 23:26 UTC (permalink / raw)
To: James Bottomley, Arnaldo Carvalho de Melo
Cc: Xiao Guangrong, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
Frederic Weisbecker, Paul Mackerras, Frank Ch. Eigler,
linux-kernel
Modules are a completely separate thing - they are linked (not even just relocated) at insertion time, so they need to be tracked separately.
The statement that a _text-based relocation is insufficient is false. The entire x86-32 monolithic kernel is relocated as a unit. The x86-64 kernel, too, is relocated as a unit, but using the page tables, which means it always runs at the compile-time-selected virtual address.
-hpa
"James Bottomley" <James.Bottomley@suse.de> wrote:
>On Wed, 2009-12-30 at 19:58 -0200, Arnaldo Carvalho de Melo wrote:
>> Em Wed, Dec 30, 2009 at 06:39:36PM -0200, Arnaldo Carvalho de Melo escreveu:
>> > Em Wed, Dec 30, 2009 at 11:45:30AM -0800, H. Peter Anvin escreveu:
>> > > The kernel already knows where it is loaded -- obviously, by sheer
>> > > necessity -- and knows how it was itself configured, and as such we
>> > > can do this calculation in C code without modifying boot_params or
>> > > the early bootstrap.
>> >
>> > Problem is that at 'perf record' time we may not have access to the
>> > vmlinux file, and thus not be able to figure out the relocation applied
>> > in that boot.
>> >
>> > Then, at a later time, and possibly on another machine, on another arch,
>> > we try to map back IPs to symbols, the /proc/kallsyms is completely
>> > unrelated and we now have a vmlinux unrelocated...
>> >
>> > So we need a way to get the relocation applied at 'perf record' time and
>> > encode it in the perf.data header. Ideas about how to do that?
>>
>> Well, I guess we could do the _stext trick again, storing its value,
>> taken from /proc/kallsyms, into the perf.data header, then figuring out
>> the relocation by looking at its value in the vmlinux symtab.
>
>So reading the thread, I think the problem only exists for x86 compiled
>as a relocatable kernel.
>
>> There were concerns in the past about relying on _stext, IIRC, James?
>
>Well, the original concerns were that _text relative relocation
>resolution only works for the core kernel, not for modules.
>Additionally, the kernel is in several sections, most notably init and
>runtime ... these may get loaded at different locations so _text
>relative symbol resolution won't work in init sections.
>
>Right at the moment, only x86 and ppc do a relocatable kernel, and, as I
>understand the process, the whole kernel image gets a relative offset
>applied, so all sections get the same offset. Thus, for this case it
>looks like computing the offset from any known symbol would work
>(including _text).
>
>James
--
Sent from my mobile phone. Please excuse any lack of formatting.
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 23:26 [PATCH 1/3] x86: record relocation offset H. Peter Anvin
@ 2009-12-30 23:41 ` James Bottomley
2009-12-30 23:46 ` H. Peter Anvin
2009-12-31 0:53 ` Frank Ch. Eigler
0 siblings, 2 replies; 17+ messages in thread
From: James Bottomley @ 2009-12-30 23:41 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Arnaldo Carvalho de Melo, Xiao Guangrong, Ingo Molnar,
Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker,
Paul Mackerras, Frank Ch. Eigler, linux-kernel
On Wed, 2009-12-30 at 15:26 -0800, H. Peter Anvin wrote:
> Modules are a completely separate thing - they are linked (not even
> just relocated) at insertion time, so they need to be tracked
> separately.
The reasons I gave were why _text relocation didn't work properly for
systemtap. The first paragraph was just giving a precis of history
explaining to Arnaldo why he remembered there was a problem with _text
based relocations.
> The statement that a _text-based relocation is insufficient is false.
> The entire x86-32 monolithic kernel is relocated as a unit. The
> x86-64 kernel, too, is relocated as a unit, but using the page tables,
> which means it always runs at the compile-time-selected virtual
> address.
Confused now ... you just repeated what I said in the second paragraph,
but made it sound like you are disagreeing?
James
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 23:41 ` James Bottomley
@ 2009-12-30 23:46 ` H. Peter Anvin
2009-12-31 0:30 ` Arnaldo Carvalho de Melo
2009-12-31 2:58 ` Xiao Guangrong
2009-12-31 0:53 ` Frank Ch. Eigler
1 sibling, 2 replies; 17+ messages in thread
From: H. Peter Anvin @ 2009-12-30 23:46 UTC (permalink / raw)
To: James Bottomley
Cc: Arnaldo Carvalho de Melo, Xiao Guangrong, Ingo Molnar,
Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker,
Paul Mackerras, Frank Ch. Eigler, linux-kernel
On 12/30/2009 03:41 PM, James Bottomley wrote:
> On Wed, 2009-12-30 at 15:26 -0800, H. Peter Anvin wrote:
>> Modules are a completely separate thing - they are linked (not even
>> just relocated) at insertion time, so they need to be tracked
>> separately.
>
> The reasons I gave was why _text relocation didn't work properly for
> systemtap. The first paragraph was just giving a precis of history
> explaining to Arnaldo why he remembered there was a problem with _text
> based relocations.
>
>> The statement that a _text-based relocation is insufficient is false.
>> The entire x86-32 monolithic kernel is relocated as a unit. The
>> x86-64 kernel, too, is relocated as a unit, but using the page tables,
>> which means it always runs at the compile-time-selected virtual
>> address.
>
> Confused now ... you just repeated what I said in the second paragraph,
> but made it sound like you are disagreeing?
>
We might have a bit of a context mismatch.
The first I saw of this thread was a proposed patch that would give the
relocation offset of the monolithic kernel, both on 32 and 64 bits,
without any explanation of the usage model. As such, from my point of
view this has always been about the monolithic kernel, until your post
mentioned modules (which the proposed patch would have done nothing about.)
The monolithic kernel offset is a single scalar constant; each module,
of course, is completely different.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 23:46 ` H. Peter Anvin
@ 2009-12-31 0:30 ` Arnaldo Carvalho de Melo
2009-12-31 3:00 ` Xiao Guangrong
2009-12-31 2:58 ` Xiao Guangrong
1 sibling, 1 reply; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-12-31 0:30 UTC (permalink / raw)
To: H. Peter Anvin
Cc: James Bottomley, Xiao Guangrong, Ingo Molnar, Thomas Gleixner,
Peter Zijlstra, Frederic Weisbecker, Paul Mackerras,
Frank Ch. Eigler, linux-kernel
Em Wed, Dec 30, 2009 at 03:46:03PM -0800, H. Peter Anvin escreveu:
> On 12/30/2009 03:41 PM, James Bottomley wrote:
> > On Wed, 2009-12-30 at 15:26 -0800, H. Peter Anvin wrote:
> >> The statement that a _text-based relocation is insufficient is false.
> >> The entire x86-32 monolithic kernel is relocated as a unit. The
> >> x86-64 kernel, too, is relocated as a unit, but using the page tables,
> >> which means it always runs at the compile-time-selected virtual
> >> address.
> >
> > Confused now ... you just repeated what I said in the second paragraph,
> > but made it sound like you are disagreeing?
>
> We might have a bit of a context mismatch.
>
> The first I saw of this thread was a proposed patch that would give the
> relocation offset of the monolithic kernel, both on 32 and 64 bits,
> without any explanation of the usage model. As such, from my point of
> view this has always been about the monolithic kernel, until your post
> mentioned modules (which the proposed patch would have done nothing about.)
>
> The monolithic kernel offset is a single scalar constant; each module,
> of course, is completely different.
Conclusion: at 'perf record' time store the address of a well-known
symbol (_text) into the perf.data header. Later, at 'perf report' time, if
using a vmlinux file, calculate the relocation by subtracting the same
well-known symbol's vmlinux address from the one stored in the header.
So no need for ioctl or boot stuff.
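The arithmetic can be sketched as follows, assuming the header holds the runtime address of _text and the vmlinux symtab gives its link-time address. The function names reloc_offset and unrelocate_ip are hypothetical illustrations, not existing perf code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Offset applied at boot: runtime address of the reference symbol
 * (as stored in the perf.data header) minus its address in vmlinux.
 */
int64_t reloc_offset(uint64_t runtime_text, uint64_t vmlinux_text)
{
    return (int64_t)(runtime_text - vmlinux_text);
}

/*
 * Translate a sampled instruction pointer back to the unrelocated
 * address space of vmlinux, so it can be looked up in its symtab.
 */
uint64_t unrelocate_ip(uint64_t ip, int64_t offset)
{
    return ip - (uint64_t)offset;
}
```

For example, if _text is 0xc1000000 at record time and 0xc0100000 in vmlinux, the offset is 0xf00000, and a sample at 0xc1000400 resolves against vmlinux address 0xc0100400. On x86-64, where the kernel runs at its compile-time virtual address, the same code simply yields an offset of zero.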
I'll do that tomorrow, if Xiao doesn't beat me to it :-)
- Arnaldo
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 23:41 ` James Bottomley
2009-12-30 23:46 ` H. Peter Anvin
@ 2009-12-31 0:53 ` Frank Ch. Eigler
1 sibling, 0 replies; 17+ messages in thread
From: Frank Ch. Eigler @ 2009-12-31 0:53 UTC (permalink / raw)
To: James Bottomley
Cc: H. Peter Anvin, Arnaldo Carvalho de Melo, Xiao Guangrong,
Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker,
Paul Mackerras, linux-kernel
Hi -
james wrote:
> The reasons I gave was why _text relocation didn't work properly for
> systemtap. The first paragraph was just giving a precis of history
> [...]
The issues you listed (kernel .init sections, modules) have nothing to
do with kernel relocation, and systemtap has approximately never had a problem
calculating relevant addresses in their presence. The systemtap
problem you are probably thinking of relates to the kernel's former
inability to insert kprobes within .init sections.
- FChE
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-30 23:46 ` H. Peter Anvin
2009-12-31 0:30 ` Arnaldo Carvalho de Melo
@ 2009-12-31 2:58 ` Xiao Guangrong
1 sibling, 0 replies; 17+ messages in thread
From: Xiao Guangrong @ 2009-12-31 2:58 UTC (permalink / raw)
To: H. Peter Anvin
Cc: James Bottomley, Arnaldo Carvalho de Melo, Ingo Molnar,
Thomas Gleixner, Peter Zijlstra, Frederic Weisbecker,
Paul Mackerras, Frank Ch. Eigler, linux-kernel
Hi Peter,
Thanks for your review and for telling us a better way to get the relocation offset.
H. Peter Anvin wrote:
>
> The first I saw of this thread was a proposed patch that would give the
> relocation offset of the monolithic kernel, both on 32 and 64 bits,
> without any explanation of the usage model. As such, from my point of
> view this has always been about the monolithic kernel, until your post
> mentioned modules (which the proposed patch would have done nothing about.)
>
We don't need to care about module symbols, since we get each module's load
address from '/proc/modules' whether or not the kernel is relocated, and perf
tools already work this way. So my patch does nothing about modules; maybe I
should mention that in the patch's changelog.
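As an illustration of reading a module's load address, here is a minimal sketch that parses one line in the /proc/modules layout "<name> <size> <refcount> <deps> <state> <address>". The helper name parse_modules_line is hypothetical, and this is not how perf itself is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the module name and load address from one
 * /proc/modules line. The size, refcount, dependency list and state fields
 * are skipped. Returns 0 on success, -1 if the line is malformed.
 */
int parse_modules_line(const char *line, char name[64],
                       unsigned long long *base)
{
    /* %llx accepts the leading "0x" printed by the kernel. */
    if (sscanf(line, "%63s %*s %*s %*s %*s %llx", name, base) != 2)
        return -1;
    return 0;
}
```

A caller would read /proc/modules line by line with fgets() and pass each line to this helper, for instance a line such as "ipv6 239420 32 ip6t_REJECT,nf_conntrack_ipv6, Live 0xfa0c2000".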
Thanks,
Xiao
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-31 0:30 ` Arnaldo Carvalho de Melo
@ 2009-12-31 3:00 ` Xiao Guangrong
2009-12-31 10:36 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 17+ messages in thread
From: Xiao Guangrong @ 2009-12-31 3:00 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: H. Peter Anvin, James Bottomley, Ingo Molnar, Thomas Gleixner,
Peter Zijlstra, Frederic Weisbecker, Paul Mackerras,
Frank Ch. Eigler, linux-kernel
Arnaldo Carvalho de Melo wrote:
>
> Conclusion: at 'perf record' time store the address of a well-known
> symbol (_text) into the perf.data header. Later, at perf report time, if
> using a vmlinux file, calculate the relocation by subtracting the same
> well known symbol from the one stored in the header.
>
> So no need for ioctl or boot stuff.
>
I'm a little confused: how do we get the symbol's load address?
Getting it from '/proc/kallsyms' is not a good way, since we can't assume the
kernel has this file.
> I'll do that tomorrow, if Xiao doesn't beats me to it :-)
>
Of course, please do if you have a better way :-)
Thanks,
Xiao
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-31 3:00 ` Xiao Guangrong
@ 2009-12-31 10:36 ` Arnaldo Carvalho de Melo
2009-12-31 10:50 ` Xiao Guangrong
2010-01-01 9:27 ` Ingo Molnar
0 siblings, 2 replies; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-12-31 10:36 UTC (permalink / raw)
To: Xiao Guangrong
Cc: H. Peter Anvin, James Bottomley, Ingo Molnar, Thomas Gleixner,
Peter Zijlstra, Frederic Weisbecker, Paul Mackerras,
Frank Ch. Eigler, linux-kernel
Em Thu, Dec 31, 2009 at 11:00:00AM +0800, Xiao Guangrong escreveu:
> Arnaldo Carvalho de Melo wrote:
> > Conclusion: at 'perf record' time store the address of a well-known
> > symbol (_text) into the perf.data header. Later, at perf report time, if
> > using a vmlinux file, calculate the relocation by subtracting the same
> > well known symbol from the one stored in the header.
> > So no need for ioctl or boot stuff.
> I'm little confused, how to get the load symbol address?
> It's not a good way, if you get it from '/proc/kallsyms', we can't assume kernel
> has this file.
Well, then it's just a matter of exposing _text as
/sys/kernel/sections/.text, as we already have for modules:
[acme@ana linux-2.6-tip]$ cat /sys/module/ipv6/sections/.text
0xfa0c2000
Which matches
nf_conntrack_ipv6 17548 2 - Live 0xfa147000
ipv6 239420 32 ip6t_REJECT,nf_conntrack_ipv6, Live 0xfa0c2000
[acme@ana linux-2.6-tip]$
But even as a quick transitional assist, we can look at kallsyms at
'perf record' time.
> > I'll do that tomorrow, if Xiao doesn't beats me to it :-)
> Of course, please do if you have a better way :-)
I meant, if you didn't write the patch first, while I was sleeping :-)
I'll work on it today after some coffee and errands.
Best Regards,
- Arnaldo
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-31 10:36 ` Arnaldo Carvalho de Melo
@ 2009-12-31 10:50 ` Xiao Guangrong
2010-01-01 9:27 ` Ingo Molnar
1 sibling, 0 replies; 17+ messages in thread
From: Xiao Guangrong @ 2009-12-31 10:50 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: H. Peter Anvin, James Bottomley, Ingo Molnar, Thomas Gleixner,
Peter Zijlstra, Frederic Weisbecker, Paul Mackerras,
Frank Ch. Eigler, linux-kernel
Arnaldo Carvalho de Melo wrote:
>
> Well, then its just a matter of exposing _text as
> /sys/kernel/sections/.text, as we already have for modules:
>
> [acme@ana linux-2.6-tip]$ cat /sys/module/ipv6/sections/.text
> 0xfa0c2000
>
> Which matches
>
> nf_conntrack_ipv6 17548 2 - Live 0xfa147000
> ipv6 239420 32 ip6t_REJECT,nf_conntrack_ipv6, Live 0xfa0c2000
> [acme@ana linux-2.6-tip]$
>
> But even as a quick transitional assist, we can look at kallsyms at
> 'perf record' time.
>
Ah, I see, that's a really nice way.
>>> I'll do that tomorrow, if Xiao doesn't beats me to it :-)
>
>> Of course, please do if you have a better way :-)
>
> I meant, if you didn't write the patch first, while I was sleeping :-)
>
> I'll work on it today after some coffee and errands.
>
Please do :-)
Xiao
* Re: [PATCH 1/3] x86: record relocation offset
2009-12-31 10:36 ` Arnaldo Carvalho de Melo
2009-12-31 10:50 ` Xiao Guangrong
@ 2010-01-01 9:27 ` Ingo Molnar
1 sibling, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2010-01-01 9:27 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Xiao Guangrong, H. Peter Anvin, James Bottomley, Thomas Gleixner,
Peter Zijlstra, Frederic Weisbecker, Paul Mackerras,
Frank Ch. Eigler, linux-kernel
* Arnaldo Carvalho de Melo <acme@infradead.org> wrote:
> Em Thu, Dec 31, 2009 at 11:00:00AM +0800, Xiao Guangrong escreveu:
> > Arnaldo Carvalho de Melo wrote:
>
> > > Conclusion: at 'perf record' time store the address of a well-known
> > > symbol (_text) into the perf.data header. Later, at perf report time, if
> > > using a vmlinux file, calculate the relocation by subtracting the same
> > > well known symbol from the one stored in the header.
>
> > > So no need for ioctl or boot stuff.
>
> > I'm little confused, how to get the load symbol address?
> > It's not a good way, if you get it from '/proc/kallsyms', we can't assume kernel
> > has this file.
>
> Well, then its just a matter of exposing _text as
> /sys/kernel/sections/.text, as we already have for modules:
>
> [acme@ana linux-2.6-tip]$ cat /sys/module/ipv6/sections/.text
> 0xfa0c2000
>
> Which matches
>
> nf_conntrack_ipv6 17548 2 - Live 0xfa147000
> ipv6 239420 32 ip6t_REJECT,nf_conntrack_ipv6, Live 0xfa0c2000
> [acme@ana linux-2.6-tip]$
Yeah, that's a good idea and pretty complementary to the existing scheme for
modules.
> But even as a quick transitional assist, we can look at kallsyms at 'perf
> record' time.
Yes, we should do that as a fallback mechanism.
(Initially this 'fallback' will be the primary method, until the
sections/.text extension gets merged upstream.)
Thanks,
Ingo