* [RFC] fix the relative jump problem on large modules
From: James Bottomley @ 2010-06-18 15:03 UTC
To: linux-parisc
Part of the arguing with ksplice about their plan for
-ffunction-sections and -fdata-sections got me thinking about how we do
modules. Right at the moment we have one section for every function in
a module, which leads to a massive amount of relocation overhead in the
in-kernel module loader. Plus for some modules (ipv6, I believe), the
relative jumps lack the range to get out of the function, because we only
put the stubs after all the text sections.
The way to fix all of this, I think, is to make the real linker do more
work. It should be beneficial to us because the linker *should* be able
to rearrange the sections to get the maximum number of jumps satisfiable
relatively.
I've tested that this works on pa8800 systems, but I'd really like
someone to try a failing module on a 32-bit platform (since 64-bit has
22-bit relative jumps, all the modules actually work there). You can see
some of the savings in scsi_mod.ko:
Before: 325 sections, 6366 relocation symbols
After: 23 sections, 5244 relocation symbols
James
---
diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile
index 55cca1d..ab88f11 100644
--- a/arch/parisc/Makefile
+++ b/arch/parisc/Makefile
@@ -21,6 +21,7 @@ KBUILD_DEFCONFIG := default_defconfig
 
 NM = sh $(srctree)/arch/parisc/nm
 CHECKFLAGS += -D__hppa__=1
+LDFLAGS_MODULE += -T $(srctree)/arch/parisc/kernel/module.lds
 
 MACHINE := $(shell uname -m)
 ifeq ($(MACHINE),parisc*)
diff --git a/arch/parisc/kernel/module.lds b/arch/parisc/kernel/module.lds
new file mode 100644
index 0000000..42ee3eb
--- /dev/null
+++ b/arch/parisc/kernel/module.lds
@@ -0,0 +1,6 @@
+SECTIONS {
+	.text : {
+		/* Gather all function sections */
+		*(.text.*)
+	}
+}
* Re: [RFC] fix the relative jump problem on large modules
From: Helge Deller @ 2010-06-18 20:40 UTC
To: James Bottomley; +Cc: linux-parisc
On 06/18/2010 05:03 PM, James Bottomley wrote:
> Part of this arguing with ksplice about their plan for
> -ffunction-sections and -fdata-sections got me thinking about how we do
> modules. Right at the moment we have one section for every function in
> a module, which leads to a massive amount of relocation overhead in the
> in-kernel module loader. Plus for some modules (ipv6, I believe), we
> lack the relative jumps to get out of the function because we only put
> the stubs after all the text sections.
>
> The way to fix all of this, I think, is to make the real linker do more
> work. It should be beneficial to us because the linker *should* be able
> to rearrange the sections to get the maximum number of jumps satisfiable
> relatively.
>
> I've tested that this works on pa8800 systems, but I'd really like
> someone to try a failing module on a 32 bit platform (since 64 bits has
> 22 bit relative jumps, all the modules actually work).
Hi James,
I think there is no failing module on 32bit right now.
The biggest modules were ipv6.ko and xfs.ko, which do work now
since the latest module changes.
But if your patch saves relocations it's a win nevertheless.
I can't test your patch right now, but will try tomorrow evening....
Helge
* Re: [RFC] fix the relative jump problem on large modules
From: Helge Deller @ 2010-06-19 22:21 UTC
To: James Bottomley; +Cc: linux-parisc
On 06/18/2010 10:40 PM, Helge Deller wrote:
> On 06/18/2010 05:03 PM, James Bottomley wrote:
>> Part of this arguing with ksplice about their plan for
>> -ffunction-sections and -fdata-sections got me thinking about how we do
>> modules. Right at the moment we have one section for every function in
>> a module, which leads to a massive amount of relocation overhead in the
>> in-kernel module loader. Plus for some modules (ipv6, I believe), we
>> lack the relative jumps to get out of the function because we only put
>> the stubs after all the text sections.
>>
>> The way to fix all of this, I think, is to make the real linker do more
>> work. It should be beneficial to us because the linker *should* be able
>> to rearrange the sections to get the maximum number of jumps satisfiable
>> relatively.
>>
>> I've tested that this works on pa8800 systems, but I'd really like
>> someone to try a failing module on a 32 bit platform (since 64 bits has
>> 22 bit relative jumps, all the modules actually work).
>
> Hi James,
>
> I think there is no failing module on 32bit right now.
> The biggest modules were ipv6.ko and xfs.ko, which do work now
> since the latest module changes.
>
> But if your patch saves relocations it's a win nevertheless.
> I can't test your patch right now, but will try tomorrow evening....
Hi James,
I just tested your patch on a 32bit kernel.
The win wrt module size is good:
ipv6.ko: 415K -> 357K
xfs.ko: 902K -> 747K
(btw, what is the command you ran to count the sections and relocs?).
But your patch doesn't work on 32bit.
root@c3000:~# modprobe xfs
FATAL: Error inserting xfs (/lib/modules/2.6.35-rc3-32bit+/kernel/fs/xfs/xfs.ko): Invalid module format
dmesg says:
module xfs relocation of symbol memcpy is out of range (0x3ffeffaa in 17 bits)
That's exactly the problem, and this reminded me of what my latest patch to
the Linux kernel module loader on hppa did.
Just look at the weak function arch_mod_section_prepend() in arch/parisc/kernel/module.c,
and at the top of that file:
* Notes:
* - PLT stub handling
* On 32bit (and sometimes 64bit) and with big kernel modules like xfs or
* ipv6 the relocation types R_PARISC_PCREL17F and R_PARISC_PCREL22F may
* fail to reach their PLT stub if we only create one big stub array for
* all sections at the beginning of the core or init section.
* Instead we now insert individual PLT stub entries directly in front of
* of the code sections where the stubs are actually called.
* This reduces the distance between the PCREL location and the stub entry
* so that the relocations can be fulfilled.
* While calculating the final layout of the kernel module in memory, the
* kernel module loader calls arch_mod_section_prepend() to request the
* to be reserved amount of memory in front of each individual section.
So, your patch merges all text sections, which then makes the new kernel
module loader fail on 32bit, since there is only one big section and the
distances are too long...
Helge
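
To make the distance argument concrete, here is a rough Python model, not the kernel's code. It assumes a PCREL branch stores a signed displacement counted in 4-byte words from the branch address plus 8, and it compares one stub array in front of a merged text section with stubs placed in front of each section. The function names, the 64 KiB section size and the 1 KiB stub area are made up for illustration.

```python
# Illustrative model only: names and sizes are hypothetical, not taken from module.c.

def reach_bytes(bits):
    """Byte distance covered by a signed `bits`-wide displacement counted in words."""
    return (1 << (bits - 1)) * 4


def reachable(branch_addr, target_addr, bits):
    """True if the target fits in a signed `bits`-bit word displacement.

    Assumption: the displacement is relative to branch_addr + 8 and is
    counted in 4-byte words.
    """
    disp = (target_addr - branch_addr - 8) // 4
    limit = 1 << (bits - 1)
    return -limit <= disp < limit


def single_stub_array_ok(section_sizes, stub_bytes, bits=17):
    """One stub array at offset 0, followed by all text merged together.

    The worst case is the last instruction of the merged text branching
    back to the first stub.
    """
    last_branch = stub_bytes + sum(section_sizes) - 4
    return reachable(last_branch, 0, bits)


def per_section_stubs_ok(section_sizes, stub_bytes, bits=17):
    """Each section gets its own stub area directly in front of it.

    The worst case per section is its last instruction branching back to
    the first stub of that section's own stub area.
    """
    return all(
        reachable(stub_bytes + size - 4, 0, bits) for size in section_sizes
    )


sizes = [64 * 1024] * 10  # ten hypothetical 64 KiB function sections
print(reach_bytes(17), reach_bytes(22))
print(single_stub_array_ok(sizes, 1024))
print(per_section_stubs_ok(sizes, 1024))
```

Under these assumptions a 17-bit displacement covers about 256 KiB in either direction and a 22-bit one about 8 MiB, so 640 KiB of merged text cannot reach a single stub array at the start, while each 64 KiB section can still reach stubs placed directly in front of it.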
* Re: [RFC] fix the relative jump problem on large modules
From: James Bottomley @ 2010-06-19 22:57 UTC
To: Helge Deller; +Cc: linux-parisc
On Sun, 2010-06-20 at 00:21 +0200, Helge Deller wrote:
> On 06/18/2010 10:40 PM, Helge Deller wrote:
> > On 06/18/2010 05:03 PM, James Bottomley wrote:
> >> Part of this arguing with ksplice about their plan for
> >> -ffunction-sections and -fdata-sections got me thinking about how we do
> >> modules. Right at the moment we have one section for every function in
> >> a module, which leads to a massive amount of relocation overhead in the
> >> in-kernel module loader. Plus for some modules (ipv6, I believe), we
> >> lack the relative jumps to get out of the function because we only put
> >> the stubs after all the text sections.
> >>
> >> The way to fix all of this, I think, is to make the real linker do more
> >> work. It should be beneficial to us because the linker *should* be able
> >> to rearrange the sections to get the maximum number of jumps satisfiable
> >> relatively.
> >>
> >> I've tested that this works on pa8800 systems, but I'd really like
> >> someone to try a failing module on a 32 bit platform (since 64 bits has
> >> 22 bit relative jumps, all the modules actually work).
> >
> > Hi James,
> >
> > I think there is no failing module on 32bit right now.
> > The biggest modules were ipv6.ko and xfs.ko, which do work now
> > since the latest module changes.
> >
> > But if your patch saves relocations it's a win nevertheless.
> > I can't test your patch right now, but will try tomorrow evening....
>
> Hi James,
>
> I just tested your patch on a 32bit kernel.
>
> The wins wrt module size is good:
> ipv6.ko: 415K -> 357K
> xfs.ko: 902K -> 747K
Actually, that's not really necessarily a win ... it's probably mostly
container code and relocations.
> (btw, what is the command you ran to count the sections and relocs?).
objdump (-r or --section-headers)
> But your patch doesn't work on 32bit.
> root@c3000:~# modprobe xfs
> FATAL: Error inserting xfs (/lib/modules/2.6.35-rc3-32bit+/kernel/fs/xfs/xfs.ko): Invalid module format
>
> dmesg says:
> module xfs relocation of symbol memcpy is out of range (0x3ffeffaa in 17 bits)
>
> That's exactly the problem, and this reminded me on what my latest patch to
> the linux kernel module loader on hppa did.
>
> Just look at the weak function arch_mod_section_prepend() in arch/parisc/kernel/module.c,
> and at the top of that file:
>
> * Notes:
> * - PLT stub handling
> * On 32bit (and sometimes 64bit) and with big kernel modules like xfs or
> * ipv6 the relocation types R_PARISC_PCREL17F and R_PARISC_PCREL22F may
> * fail to reach their PLT stub if we only create one big stub array for
> * all sections at the beginning of the core or init section.
> * Instead we now insert individual PLT stub entries directly in front of
> * of the code sections where the stubs are actually called.
> * This reduces the distance between the PCREL location and the stub entry
> * so that the relocations can be fulfilled.
> * While calculating the final layout of the kernel module in memory, the
> * kernel module loader calls arch_mod_section_prepend() to request the
> * to be reserved amount of memory in front of each individual section.
>
> So, your patch merges all text sections, which then let the new kernel
> module loader fail on 32bit since it's only one big section with too long distances...
The theory was that the linker should do the right thing and not emit a
relocation that can't reach the boundary of the text segment (i.e. it
should embed a stub between the sections as it combines them). I'll
have to fire up a 32 bit compile and see exactly what it thinks it's
doing ... I've got a nasty feeling it expects us to be able to stub
either at the beginning or the end, which the in-kernel loader doesn't do.
Thanks for testing the preliminary patch, anyway.
James
* Re: [RFC] fix the relative jump problem on large modules
From: John David Anglin @ 2010-06-19 23:21 UTC
To: James Bottomley; +Cc: deller, linux-parisc
> The theory was that the linker should do the right thing and not emit a
> relocation that can't reach the boundary of the text segment (i.e. it
> should embed a stub between the sections as it combines them). I'll
> have to fire up a 32 bit compile and see exactly what it thinks it's
> doing ... I've got a nasty feeling it expects us to be able to stub
> either at the beginning or the end, which the in-kernel loader doesn't.
See ld --stub-group-size=N option. GCC assumes stub sections are at
the beginning and that the linker can insert stubs between input sections.
Merging text sections is likely to cause problems.
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
* Re: [RFC] fix the relative jump problem on large modules
From: James Bottomley @ 2010-06-20 13:55 UTC
To: John David Anglin; +Cc: deller, linux-parisc
On Sat, 2010-06-19 at 19:21 -0400, John David Anglin wrote:
> > The theory was that the linker should do the right thing and not emit a
> > relocation that can't reach the boundary of the text segment (i.e. it
> > should embed a stub between the sections as it combines them). I'll
> > have to fire up a 32 bit compile and see exactly what it thinks it's
> > doing ... I've got a nasty feeling it expects us to be able to stub
> > either at the beginning or the end, which the in-kernel loader doesn't.
>
> See ld --stub-group-size=N option.
I was assuming that was the default, like on arm. I take it it's not?
> GCC assumes stub sections are at
> the beginning and that the linker can insert stubs between input sections.
> Merging text sections is likely to cause problems.
Well, the theory was that ld was capable of more intelligent section
layout decisions than the in-kernel linker. If you're saying that's not
true, then there's probably not much point to doing this.
James
* Re: [RFC] fix the relative jump problem on large modules
From: John David Anglin @ 2010-06-20 14:35 UTC
On Sun, 20 Jun 2010, James Bottomley wrote:
> On Sat, 2010-06-19 at 19:21 -0400, John David Anglin wrote:
> > > The theory was that the linker should do the right thing and not emit a
> > > relocation that can't reach the boundary of the text segment (i.e. it
> > > should embed a stub between the sections as it combines them). I'll
> > > have to fire up a 32 bit compile and see exactly what it thinks it's
> > > doing ... I've got a nasty feeling it expects us to be able to stub
> > > either at the beginning or the end, which the in-kernel loader doesn't.
> >
> > See ld --stub-group-size=N option.
>
> I was assuming that was the default, like on arm, I take it it's not?
The default N value puts stubs at the beginning or end. A negative value
puts stubs before the input section. The default value for N matches
the value in gcc, which came from hpux.
I believe that arm originally followed the parisc approach. Possibly,
their current implementation should be looked at.
> > GCC assumes stub sections are at
> > the beginning and that the linker can insert stubs between input sections.
> > Merging text sections is likely to cause problems.
>
> Well, the theory was that ld was capable of more intelligent section
> layout decisions than the in-kernel linker. If you're saying that's not
> true, then there's probably not much point to doing this.
I'm not sure it's more intelligent. The code for generating stub groups
is quite old now.
Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)