* [PATCH 00/05] robust per_cpu allocation for modules
@ 2006-04-14 21:18 Steven Rostedt
2006-04-14 22:06 ` Andrew Morton
` (4 more replies)
0 siblings, 5 replies; 31+ messages in thread
From: Steven Rostedt @ 2006-04-14 21:18 UTC (permalink / raw)
To: LKML, Andrew Morton
Cc: linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro,
Joe Taylor, linuxppc-dev, paulus, benedict.gaster, bjornw,
Ingo Molnar, grundler, starvik, Linus Torvalds, Thomas Gleixner,
rth, chris, tony.luck, Andi Kleen, ralf, Marc Gauthier, lethal,
schwidefsky, linux390, davem, parisc-linux
The current method of allocating space for per_cpu variables in modules
is not robust and consumes quite a bit of space.
per_cpu variables:
The per_cpu variables are declared by code that needs to have variables
spaced out by cache lines on SMP machines, such that writing to any of
these variables on one CPU won't be in danger of writing into a cache
line of a global variable shared by other CPUs. If that were to happen,
performance would go down, with the CPUs unnecessarily needing to update
cache lines across CPUs for even read-only global variables.
To solve this, a developer needs only to define a per_cpu variable
using the DEFINE_PER_CPU(type, var) macro. This places the variable
into the .data.percpu section. On boot up, an area of the size of this
section (or PERCPU_ENOUGH_ROOM, mentioned later, if that is larger)
times NR_CPUS is allocated, and the .data.percpu section is copied into
this area once for each of the NR_CPUS. The .data.percpu section is
later discarded (the variables now live in the allocated area).
The __per_cpu_offset[] array holds the difference between
the .data.percpu section and the location where the data is actually
stored: __per_cpu_offset[0] holds the difference for the variables
assigned to cpu 0, __per_cpu_offset[1] for the variables assigned to
cpu 1, and so on.
To access a per_cpu variable, the per_cpu(var, cpu) macro is used. This
macro takes the address of the variable (still pointing into the
discarded .data.percpu section) and adds __per_cpu_offset[cpu] to it.
The result is the location of the actual variable for the specified CPU
within the allocated area.
Modules:
Since per_cpu has no way to know whether the variable was part of a
module or part of the kernel, the variables for the module need to be
located in the same allocated area as the per_cpu variables created in
the kernel.
Why is that?
The per_cpu variables are used in the kernel basically like normal
variables. For example:
with:
DEFINE_PER_CPU(int, myint);
we can do the following:
per_cpu(myint, cpu) = 4;
int i = per_cpu(myint, cpu);
int *i = &per_cpu(myint, cpu);
Not to mention that we can export these variables as well, so that a
module can use a per_cpu variable from the kernel, or even one declared
in another module and exported (the net code does this).
Now remember, the variables are still located in the discarded section,
but their contents live in allocated space, offset per CPU, with a
single array (__per_cpu_offset) storing these offsets. This makes it
very difficult to define special DEFINE/DECLARE_PER_CPU macros and use
CONFIG_MODULES to play magic in figuring things out, mainly because we
have one per_cpu macro that can be used in a module referencing per_cpu
variables declared in the kernel, declared in the given module, or even
declared in another module.
PERCPU_ENOUGH_ROOM:
When you configure an SMP kernel with loadable modules, the kernel needs
to take an aggressive stance and preallocate enough room to hold the
per_cpu variables of all the modules that could be loaded. To make
matters worse, this space is allocated per CPU! So if you have a
64-processor machine with loadable modules, you are allocating extra
space for each of the 64 CPUs even if you never load a module that has a
per_cpu variable in it!
Currently PERCPU_ENOUGH_ROOM is defined as 32768 (32K). On my 2x Intel
SMP machine, with my normal configuration, using 2.6.17-rc1, the size
of .data.percpu is 17892 (about 17K). So the extra space reserved for
the modules is 32768 - 17892 = 14876 (about 14K). Since this is
allocated for every CPU, I am actually using
14876 * 2 = 29752 bytes (about 29K).
Now looking at the modules that I have loaded, none of them had
a .data.percpu section defined, so that 29K was a complete waste!
So the current solution has two flaws:
1. Not robust. If we someday add more modules that together take up
more than 14K, we need to manually update PERCPU_ENOUGH_ROOM.
2. Wasted memory. We have 14K of memory wasted per CPU. Remember,
a 64-processor machine would be wasting 896K of memory!
A solution:
I spent some time trying to come up with a solution to all this,
something that wouldn't be too intrusive to the way things already work.
I received helpful input from Andi Kleen and Thomas Gleixner. I first
tried to use __builtin_choose_expr and __builtin_types_compatible_p
to determine at compile time whether a variable is from the kernel or a
module. But unfortunately, I've been told that makes things too
complex, and even worse, it had "show stopping" flaws.
Ideally this could be resolved at link time of the module, but that too
would require looking into the relocation tables which are different for
every architecture. This would be too intrusive, and prone to bugs.
So I went for a much simpler solution. This solution is not optimal in
saving space, but it does much better than what is currently
implemented, and is still easy to understand and manage, which alone may
outweigh an optimal space solution.
First off, if CONFIG_SMP or CONFIG_MODULES is not set, the solution is
the same as it currently is. So my solution only affects the kernel if
both CONFIG_SMP and CONFIG_MODULES are set (this is the same
configuration that wastes the memory in the current implementation).
I created a new section called .data.percpu_offset. This section
holds a pointer for every variable that is defined as per_cpu with
DEFINE_PER_CPU. Although this costs space too, the amount of space
needed for my setup (the same configuration that wastes 14K per CPU) is
4368 bytes (about 4K). Since this section is not copied for every CPU,
this saves us 10K for the first CPU (14 - 4) and 14K for every CPU after
that, for a total saving of 24K on my setup. (Note: I noticed that I
had used the default NR_CPUS, which is 8, so this really saved me 108K.)
Each entry in .data.percpu_offset is named after its per_cpu variable
and points to the __per_cpu_offset array. For modules, it points to the
per_cpu_offset array of the module instead.
Example:
DEFINE_PER_CPU(int, myint);
would now also create a variable called per_cpu_offset__myint in
the .data.percpu_offset section. If myint is defined in the kernel,
this variable points to the __per_cpu_offset[] array. If it were a
module variable, it would point to the module's per_cpu_offset[] array,
which is created when the module is loaded.
So now I can get rid of the PERCPU_ENOUGH_ROOM constant and some of the
complexity in kernel/module.c that shares code with the kernel, and each
module gets its own allocation of per_cpu data. This makes the per_cpu
data more robust (it can handle future changes in the modules) and saves
space.
Drawbacks:
The one drawback I have with this is that because the DEFINE_PER_CPU
macro now declares two variables, you can't declare a "static
DEFINE_PER_CPU". So instead I created a DEFINE_STATIC_PER_CPU macro to
handle this case.
The following patch set is against 2.6.17-rc1, but is currently only
for i386. I have an x86_64 box that I can use to do that port, but I
will need the help of others to port to some of the other archs, mostly
the other 64-bit ones. I tried to CC the maintainers of the other archs
(those listed in the vmlinux.lds and include/asm-<arch>/percpu.h files
and in the MAINTAINERS file).
I'm not going to spam the CC list (nor Andrew) with the rest of the
patches (only 5). Please see LKML for the rest.
-- Steve
^ permalink raw reply [flat|nested] 31+ messages in thread

* Re: [PATCH 00/05] robust per_cpu allocation for modules
2006-04-14 21:18 [PATCH 00/05] robust per_cpu allocation for modules Steven Rostedt
@ 2006-04-14 22:06 ` Andrew Morton
2006-04-14 22:12 ` Steven Rostedt
2006-04-14 22:12 ` Chen, Kenneth W
` (3 subsequent siblings)
4 siblings, 1 reply; 31+ messages in thread
From: Andrew Morton @ 2006-04-14 22:06 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-mips, davidm, linux-ia64, mj, spyro, joe, ak, linuxppc-dev,
paulus, benedict.gaster, bjornw, mingo, grundler, starvik, torvalds,
tglx, rth, chris, tony.luck, linux-kernel, ralf, marc, lethal,
schwidefsky, linux390, davem, parisc-linux

Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Example:
>
> DEFINE_PER_CPU(int, myint);
>
> would now create a variable called per_cpu_offset__myint in
> the .data.percpu_offset section.

Suppose two .c files each have

DEFINE_STATIC_PER_CPU(myint)

Do we end up with two per_cpu_offset__myint's in the same section?
* Re: [PATCH 00/05] robust per_cpu allocation for modules
2006-04-14 22:06 ` Andrew Morton
@ 2006-04-14 22:12 ` Steven Rostedt
0 siblings, 0 replies; 31+ messages in thread
From: Steven Rostedt @ 2006-04-14 22:12 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mips, davidm, linux-ia64, mj, spyro, joe, ak, linuxppc-dev,
paulus, benedict.gaster, bjornw, mingo, grundler, starvik, torvalds,
tglx, rth, chris, tony.luck, linux-kernel, ralf, marc, lethal,
schwidefsky, linux390, davem, parisc-linux

On Fri, 14 Apr 2006, Andrew Morton wrote:
> Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > Example:
> >
> > DEFINE_PER_CPU(int, myint);
> >
> > would now create a variable called per_cpu_offset__myint in
> > the .data.percpu_offset section.
>
> Suppose two .c files each have
>
> DEFINE_STATIC_PER_CPU(myint)
>
> Do we end up with two per_cpu_offset__myint's in the same section?
>

Both variables are defined as static, ie.

#define DEFINE_STATIC_PER_CPU(type, name) \
	static __attribute__((__section__(".data.percpu_offset"))) unsigned long *per_cpu_offset__##name; \
	static __attribute__((__section__(".data.percpu"))) __typeof__(type) per_cpu__##name

So the per_cpu_offset__myint is also static, and gcc should treat it
properly. Although, yes, there will probably be two variables named
per_cpu_offset__myint in the same section, each is visible only to
whoever sees the static. It works like any other static variable, and
even like the current way DEFINE_PER_CPU works with statics.

Thanks,

-- Steve
* RE: [PATCH 00/05] robust per_cpu allocation for modules
2006-04-14 21:18 [PATCH 00/05] robust per_cpu allocation for modules Steven Rostedt
2006-04-14 22:06 ` Andrew Morton
@ 2006-04-14 22:12 ` Chen, Kenneth W
2006-04-15 3:10 ` [PATCH 00/08] robust per_cpu allocation for modules - V2 Steven Rostedt
` (2 subsequent siblings)
4 siblings, 0 replies; 31+ messages in thread
From: Chen, Kenneth W @ 2006-04-14 22:12 UTC (permalink / raw)
To: 'Steven Rostedt', LKML, Andrew Morton
Cc: linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro,
Joe Taylor, linuxppc-dev, paulus, benedict.gaster, bjornw,
Ingo Molnar, grundler, starvik, Linus Torvalds, Thomas Gleixner,
rth, chris, Luck, Tony, Andi Kleen, ralf, Marc Gauthier, lethal,
schwidefsky, linux390, davem, parisc-linux

Steven Rostedt wrote on Friday, April 14, 2006 2:19 PM
> So the current solution has two flaws:
> 1. not robust. If we someday add more modules that together take up
> more than 14K, we need to manually update the PERCPU_ENOUGH_ROOM.
> 2. waste of memory. We have 14K of memory wasted per CPU. Remember
> a 64 processor machine would be wasting 896K of memory!

For someone who has the money to own a 64-processor machine, 896K of
memory is pocket change ;-)

- Ken
* [PATCH 00/08] robust per_cpu allocation for modules - V2
2006-04-14 21:18 [PATCH 00/05] robust per_cpu allocation for modules Steven Rostedt
2006-04-14 22:06 ` Andrew Morton
2006-04-14 22:12 ` Chen, Kenneth W
@ 2006-04-15 3:10 ` Steven Rostedt
2006-04-15 5:32 ` [PATCH 00/05] robust per_cpu allocation for modules Nick Piggin
2006-04-16 6:35 ` Paul Mackerras
4 siblings, 0 replies; 31+ messages in thread
From: Steven Rostedt @ 2006-04-15 3:10 UTC (permalink / raw)
To: LKML
Cc: Andrew Morton, linux-mips, linux-ia64, Martin Mares, spyro,
Joe Taylor, linuxppc-dev, paulus, Sam Ravnborg, bjornw, Ingo Molnar,
grundler, starvik, Linus Torvalds, Thomas Gleixner, rth,
Chris Zankel, tony.luck, Andi Kleen, ralf, Marc Gauthier, lethal,
schwidefsky, linux390, davem, parisc-linux

This is version 2 of the percpu patch set.

Changes from version 1:

- Created a PERCPU_OFFSET variable to use in vmlinux.lds.h
(suggested by Sam Ravnborg)

- Added support for x86_64 (Steven Rostedt)

The support for x86_64 goes back to the asm-generic handling when both
CONFIG_SMP and CONFIG_MODULES are set. This is due to the fact that
the __per_cpu_offset array is no longer referenced in per_cpu, but
instead a per-variable pointer is used to find the offset.

Again, the rest of the patches are only sent to the LKML. Still I need
help to port this to the rest of the architectures.

Thanks,

-- Steve
* Re: [PATCH 00/05] robust per_cpu allocation for modules
2006-04-14 21:18 [PATCH 00/05] robust per_cpu allocation for modules Steven Rostedt
` (2 preceding siblings ...)
2006-04-15 3:10 ` [PATCH 00/08] robust per_cpu allocation for modules - V2 Steven Rostedt
@ 2006-04-15 5:32 ` Nick Piggin
2006-04-15 20:17 ` Steven Rostedt
2006-04-17 16:55 ` Christoph Lameter
2006-04-16 6:35 ` Paul Mackerras
4 siblings, 2 replies; 31+ messages in thread
From: Nick Piggin @ 2006-04-15 5:32 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus,
benedict.gaster, bjornw, Ingo Molnar, grundler, starvik,
Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML,
ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem,
parisc-linux

Steven Rostedt wrote:

> would now create a variable called per_cpu_offset__myint in
> the .data.percpu_offset section. This variable will point to the (if
> defined in the kernel) __per_cpu_offset[] array. If this was a module
> variable, it would point to the module per_cpu_offset[] array which is
> created when the module is loaded.

If I'm following you correctly, this adds another dependent load
to a per-CPU data access, and from memory that isn't node-affine.

If so, I think people with SMP and NUMA kernels would care more
about performance and scalability than the few k of memory this
saves.

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
* Re: [PATCH 00/05] robust per_cpu allocation for modules
2006-04-15 5:32 ` [PATCH 00/05] robust per_cpu allocation for modules Nick Piggin
@ 2006-04-15 20:17 ` Steven Rostedt
2006-04-16 2:47 ` Nick Piggin
2006-04-17 16:55 ` Christoph Lameter
1 sibling, 1 reply; 31+ messages in thread
From: Steven Rostedt @ 2006-04-15 20:17 UTC (permalink / raw)
To: Nick Piggin
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus,
benedict.gaster, bjornw, Ingo Molnar, grundler, starvik,
Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML,
ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem,
parisc-linux

On Sat, 15 Apr 2006, Nick Piggin wrote:
> Steven Rostedt wrote:
>
> > would now create a variable called per_cpu_offset__myint in
> > the .data.percpu_offset section. This variable will point to the (if
> > defined in the kernel) __per_cpu_offset[] array. If this was a module
> > variable, it would point to the module per_cpu_offset[] array which is
> > created when the module is loaded.
>
> If I'm following you correctly, this adds another dependent load
> to a per-CPU data access, and from memory that isn't node-affine.
>
> If so, I think people with SMP and NUMA kernels would care more
> about performance and scalability than the few k of memory this
> saves.

It's not just about saving memory, but also about making it more
robust. But that's another story.

Both the offset array and the variables are mainly read-only (only
written on boot up), and the added variables are in their own section.
Couldn't something be done to help preload this into a local cache, or
something similar?

I understand SMP issues pretty well, but NUMA is still somewhat foreign
to me.

-- Steve
* Re: [PATCH 00/05] robust per_cpu allocation for modules
2006-04-15 20:17 ` Steven Rostedt
@ 2006-04-16 2:47 ` Nick Piggin
2006-04-16 3:53 ` Steven Rostedt
0 siblings, 1 reply; 31+ messages in thread
From: Nick Piggin @ 2006-04-16 2:47 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus,
benedict.gaster, bjornw, Ingo Molnar, grundler, starvik,
Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML,
ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem,
parisc-linux

Steven Rostedt wrote:
> On Sat, 15 Apr 2006, Nick Piggin wrote:
>
>> Steven Rostedt wrote:
>>
>>> would now create a variable called per_cpu_offset__myint in
>>> the .data.percpu_offset section. This variable will point to the (if
>>> defined in the kernel) __per_cpu_offset[] array. If this was a module
>>> variable, it would point to the module per_cpu_offset[] array which is
>>> created when the module is loaded.
>>
>> If I'm following you correctly, this adds another dependent load
>> to a per-CPU data access, and from memory that isn't node-affine.
>>
>> If so, I think people with SMP and NUMA kernels would care more
>> about performance and scalability than the few k of memory this
>> saves.
>
> It's not just about saving memory, but also about making it more
> robust. But that's another story.

But making it slower isn't going to be popular.

Why is your module using so much per-cpu memory, anyway?

> Both the offset array and the variables are mainly read-only (only
> written on boot up), and the added variables are in their own section.
> Couldn't something be done to help preload this into a local cache, or
> something similar?

It would still add to the dependent loads on the critical path, so
it now prevents the compiler/programmer/oooe engine from speculatively
loading the __per_cpu_offset.

And it does increase the cache footprint of per-cpu accesses, which are
supposed to be really light and substitute for [NR_CPUS] arrays.

I don't think it would have been hard for the original author to make
it robust... just not both fast and robust. PERCPU_ENOUGH_ROOM seems
like an ugly hack at first glance, but I'm fairly sure it was a result
of design choices.

--
SUSE Labs, Novell Inc.
* Re: [PATCH 00/05] robust per_cpu allocation for modules
2006-04-16 2:47 ` Nick Piggin
@ 2006-04-16 3:53 ` Steven Rostedt
2006-04-16 7:02 ` Paul Mackerras
2006-04-16 7:06 ` Nick Piggin
0 siblings, 2 replies; 31+ messages in thread
From: Steven Rostedt @ 2006-04-16 3:53 UTC (permalink / raw)
To: Nick Piggin
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus,
benedict.gaster, bjornw, Ingo Molnar, grundler, starvik,
Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML,
ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem,
parisc-linux

On Sun, 16 Apr 2006, Nick Piggin wrote:
> Steven Rostedt wrote:
> >
> > It's not just about saving memory, but also about making it more
> > robust. But that's another story.
>
> But making it slower isn't going to be popular.

You're right, and I've been thinking of modifications to fix that.
These patches were meant to shake up ideas.

> Why is your module using so much per-cpu memory, anyway?

It wasn't my module anyway. The problem appeared in the -rt patch set,
when tracing was turned on. Some module was affected, and grew its
per_cpu size by quite a bit. In fact we had to increase
PERCPU_ENOUGH_ROOM by up to something like 300K.

> It would still add to the dependent loads on the critical path, so
> it now prevents the compiler/programmer/oooe engine from speculatively
> loading the __per_cpu_offset.
>
> And it does increase the cache footprint of per-cpu accesses, which are
> supposed to be really light and substitute for [NR_CPUS] arrays.
>
> I don't think it would have been hard for the original author to make
> it robust... just not both fast and robust. PERCPU_ENOUGH_ROOM seems
> like an ugly hack at first glance, but I'm fairly sure it was a result
> of design choices.

Yeah, and I discovered the reasons for those choices as I worked on
this.

I've put a little more thought into this and still think there's a
solution that doesn't slow things down. Since the per_cpu_offset
section is still smaller than PERCPU_ENOUGH_ROOM, and robust, I could
still copy it into a per-cpu memory area, and even add the
__per_cpu_offset to it. This would still save quite a bit of space.

So now I'm asking for advice on some ideas that could be a workaround
to keep both the robustness and the speed.

Is there a way (for archs that support it) to allocate memory in a
per-cpu manner, so that each CPU would have its own variable table in
the memory that is best for it? Then have a field (like the pda on
x86_64) to point to this section, and use the linker offsets to index
and find the per_cpu variables.

This solution still has one more redirection than the current one
(per_cpu_offset__##var -> __per_cpu_offset -> actual_var, whereas the
current solution is __per_cpu_offset -> actual_var), but all the loads
would be done from memory that is specific to a particular CPU.

The generic case would still be the same as the patches I already sent,
but the archs that can support it can have something like the above.

Would something like that be acceptable?

Thanks,

-- Steve
* Re: [PATCH 00/05] robust per_cpu allocation for modules
2006-04-16 3:53 ` Steven Rostedt
@ 2006-04-16 7:02 ` Paul Mackerras
2006-04-16 13:40 ` Steven Rostedt
2006-04-16 7:06 ` Nick Piggin
1 sibling, 1 reply; 31+ messages in thread
From: Paul Mackerras @ 2006-04-16 7:02 UTC (permalink / raw)
To: Steven Rostedt
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev,
benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, rusty,
starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel,
tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390,
davem, parisc-linux

Steven Rostedt writes:

> So now I'm asking for advice on some ideas that could be a workaround
> to keep both the robustness and the speed.

Ideally, what I'd like to do on powerpc is to dedicate one register to
storing a per-cpu base address or offset, and be able to resolve the
offset at link time, so that per-cpu variable accesses just become a
register + offset memory access. (For modules, "link time" would be
module load time.)

We *might* be able to use some of the infrastructure that was put into
gcc and binutils to support TLS (thread local storage) to achieve
this. (See http://people.redhat.com/drepper/tls.pdf for some of the
details of that.)

Also, I've added Rusty Russell to the cc list, since he designed the
per-cpu variable stuff in the first place, and would be able to
explain the trade-offs that led to the PERCPU_ENOUGH_ROOM thing. (I
think you're discovering them as you go, though. :)

Paul.
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 7:02 ` Paul Mackerras @ 2006-04-16 13:40 ` Steven Rostedt 2006-04-16 14:03 ` Sam Ravnborg ` (2 more replies) 0 siblings, 3 replies; 31+ messages in thread From: Steven Rostedt @ 2006-04-16 13:40 UTC (permalink / raw) To: Paul Mackerras Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, rusty, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux On Sun, 2006-04-16 at 17:02 +1000, Paul Mackerras wrote: > Steven Rostedt writes: > > > So now I'm asking for advice on some ideas that can be a work around to > > keep the robustness and speed. > > Ideally, what I'd like to do on powerpc is to dedicate one register to > storing a per-cpu base address or offset, and be able to resolve the > offset at link time, so that per-cpu variable accesses just become a > register + offset memory access. (For modules, "link time" would be > module load time.) That was my original goal too, but the per_cpu and modules has problems to solve this. > > We *might* be able to use some of the infrastructure that was put into > gcc and binutils to support TLS (thread local storage) to achieve > this. (See http://people.redhat.com/drepper/tls.pdf for some of the > details of that.) Thanks for the pointer I'll give it a read (but on Monday). > > Also, I've added Rusty Russell to the cc list, since he designed the > per-cpu variable stuff in the first place, and would be able to > explain the trade-offs that led to the PERCPU_ENOUGH_ROOM thing. (I > think you're discovering them as you go, though. :) Thanks for adding Rusty, I thought I did, but looking back to my original posts, I must have missed him. 
Since Rusty's on the list now, here's the issues I have already found that caused the use of PERCPU_ENOUGH_ROOM. I'll try to explain them the best I can such that others also understand the issues at hand, and Rusty can jump in and tell us where I missed. I've explained some of this in my first email, but I'll repeat it again here. I'll first explain things how they are done generic and then what I understand the x86_64 does (I believe ppc is similar). The per_cpu variables are defined with the macro DEFINE_PER_CPU(type, var) This macro just places the variable into the section .data.percpu and prepends the prefix "per_cpu__" to the variable. To use this variable in another .c file the declaration is used by the macro DECLARE_PER_CPU(type, var) This macro is simply the extern declaration of the variable with the prefix added. If this variable is to be used outside the kernel, or in the case it was declared in a module and needs to be used in other modules, it is exported with the macro EXPORT_PER_CPU_SYMBOL(var) or EXPORT_PER_CPU_SYMBOL_GPL(var) This macro is the same as their EXPORT_SYMBOL equivalents except that it adds the per_cpu__ prefix. >From the above, it can be seen that on boot up the per_cpu variables are really just allocate once in their own section .data.percpu. So the kernel now figures out the size of this section cache aligns it and then allocates (ALIGN(size,SMP_CACHE_BYTES) * NR_CPUS). It then copies the contents of the .data.percpu section into this newly allocated area NR_CPUS times. The offset for each allocation is stored in the __per_cpu_offset[] array. This offset is the difference from the start of each allocated per_cpu area to the start of the .data.percpu section. Now that the section has been copied for every CPU into it's own area, the original .data.percpu section can be discarded and freed for use elsewhere. To access the per_cpu variables the macro per_cpu(var, cpu) is used. This macro is where the magic happens. 
The macro adds the prefix "per_cpu__" to the var and then takes its address and adds the offset of __per_cpu_offset[cpu] to it to resolve the actual location that the variable is at. This macro is also done such that it can be used as a normal variable. For example: DEFINE_PER_CPU(int, myint); int t = per_cpu(myint, cpu); per_cpu(myint, cpu) = t; int *y = &per_cpu(myint, cpu); And it handles arrays as well. DEFINE_PER_CPU(int, myintarr[10]); per_cpu(myintarray[3], cpu) = 2; and so on. This is all fine until we add loadable module support that also uses their own per_cpu variables, and it makes it even worst that the modules too can export these variables to be used in other modules. To handle this, Rusty added a reserved area in the per_cpu allocation of PERCPU_ENOUGH_ROOM. This size is meant to hold both the kernel per_cpu variables as well as the module ones. So if CONFIG_MODULES is defined and PERCPU_ENOUGH_ROOM is greater than the size of the .data.percpu section, then the PERCPU_ENOUGH_ROOM is used in the allocation of the per_cpu area. The allocation size is PERCPU_ENOUGH_ROOM * NR_CPUS, and the offsets of each cpu area is separated by PERCPU_ENOUGH_ROOM bytes. When a module is loaded, a slightly complex algorithm is used to find and keep track of what reserved area is available, and which is not. When a module is using per_cpu data, it finds memory in this reserve and then its .data.percpu section is copied into this reserve NR_CPUS times (this isn't quite accurate, since the macro for_each_possible_cpu is used here). The reason that this is done, is that the per_cpu macro cant know whether or not the per_cpu variable was declared in a kernel or in a module. So the __pre_cpu_offset[] array offset can't be used if the module allocation is in its own separate area. Remember that this offset array stores the difference from where the variable originally was and where it is now for each cpu. 
You might think you could just allocate the space for this in a module since we have control of the linker to place the section anywhere we want, and then play with the difference such that the __per_cpu_offset would find the new location, but this can only work for cpu[0]. Remember that this offset array is spaced by the size of .data.percpu, so how can you guarantee to allocate the space for CPU 1 for a module that would then be offset to the location by __per_cpu_offse[1]? So the module solution cant be solved this way. My solution, was to change this by creating a new section called .data.percpu_offset. This section would hold a pointer to the __per_cpu_offset (for kernel or module) for every per_cpu variable defined. This is done by making DEFINE_PER_CPU(var,cpu) not only define the pre_cpu__##var but also a per_cpu_offset__##var. This way the per_cpu macro can use the name to find the area that the variable resides. And so modules can now allocate their own space. Now a quick description of what x86_64 does. Instead of allocating one big chunk for the per_cpu area that contains the variables for all the CPUs, it allocates one chunk per cpu in the cpu node area. So that the memory for a per_cpu of a given CPU is in an area that can be quickly received by that CPU nicely in a NUMA fashion. This is because instead of using the __pre_cpu_offset array, it uses a PDA descriptor that is used to store data for each CPU. Now my solution is still in its infancy, and can still be optimized. Ideally, we want this to be as fast as the current solution, or at least not any noticeable difference. my current solution doesn't do this, but before we strike it down, is there ways to change that and make it do so. The added space in the .data.percpu_offset is much smaller then the extra space in PERCPU_ENOUGH_ROOM, so if I need to duplicate the .data.percpu_offset, then we still save space and keep it robust where we wont need to ever worry about adjusting PERCPU_ENOUGH_ROOM. 
But then again, if I were to duplicate this section, then I would have the same problem finding this section as I do with finding the per_cpu__##var! :(

I'll think more about this, but maybe someone else has some crazy ideas that can find a solution to this that is both fast and robust. Some ideas come from looking at gcc builtin macros and linker magic. One thing we can tell is the address of these variables, and maybe that can be used in the per_cpu macro to determine where to find them.

Some people may think I'm stubborn in wanting to fix this, but I still think that, although it's fast, the current solution is somewhat of a hack. And I still believe we can clean it up without hurting performance.

Thanks for the time in reading all of this.

-- Steve

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 13:40 ` Steven Rostedt @ 2006-04-16 14:03 ` Sam Ravnborg 2006-04-16 15:34 ` Arnd Bergmann 2006-04-17 6:47 ` Rusty Russell 2 siblings, 0 replies; 31+ messages in thread From: Sam Ravnborg @ 2006-04-16 14:03 UTC (permalink / raw) To: Steven Rostedt Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, Paul Mackerras, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, rusty, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On Sun, Apr 16, 2006 at 09:40:04AM -0400, Steven Rostedt wrote:
> The per_cpu variables are defined with the macro
> DEFINE_PER_CPU(type, var)
>
> This macro just places the variable into the section .data.percpu and
> prepends the prefix "per_cpu__" to the variable.
>
> To use this variable in another .c file the declaration is used by the
> macro
> DECLARE_PER_CPU(type, var)
>
> This macro is simply the extern declaration of the variable with the
> prefix added.

Surprisingly this macro shows up in ~19 .c files. Only valid usage is forward declaration of a later static definition with DEFINE_PER_CPU. arch/m32r/kernel/smp.c + arch/m32r/kernel/smpboot.c is just one example. Just a random comment not related to Steven's patches. Sam ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 13:40 ` Steven Rostedt 2006-04-16 14:03 ` Sam Ravnborg @ 2006-04-16 15:34 ` Arnd Bergmann 2006-04-16 18:03 ` Tony Luck ` (2 more replies) 2006-04-17 6:47 ` Rusty Russell 2 siblings, 3 replies; 31+ messages in thread From: Arnd Bergmann @ 2006-04-16 15:34 UTC (permalink / raw) To: Steven Rostedt Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, Paul Mackerras, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, rusty, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On Sunday 16 April 2006 15:40, Steven Rostedt wrote:
> I'll think more about this, but maybe someone else has some crazy ideas
> that can find a solution to this that is both fast and robust.

Ok, you asked for a crazy idea, you're going to get it ;-)

You could take a fixed range from the vmalloc area (e.g. 1MB per cpu) and use that to remap pages on demand when you need per cpu data.

#define PER_CPU_BASE 0xe000000000000000UL /* arch dependent */
#define PER_CPU_STRIDE 0x100000UL
#define __per_cpu_offset(__cpu) (PER_CPU_BASE + PER_CPU_STRIDE * (__cpu))
#define per_cpu(var, cpu) (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset(cpu)))
#define __get_cpu_var(var) per_cpu(var, smp_processor_id())

This is a lot like what the current sparc64 implementation already does.

The tricky part here is the remapping of pages. You'd need to alloc_pages_node() new pages whenever the already reserved space is not enough for the module you want to load, and then map_vm_area() them into the space reserved for them.
Advantages of this solution are:
- no dependent load access for per_cpu()
- might be flexible enough to implement a faster per_cpu_ptr()
- can be combined with ia64-style per-cpu remapping

Disadvantages are:
- you can't use huge tlbs for mapping per cpu data like the
  regular linear mapping -> may be slower on some archs
- does not work in real mode, so percpu data can't be used
  inside exception handlers on some architectures.
- memory consumption is rather high when PAGE_SIZE is large

Arnd <><

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 15:34 ` Arnd Bergmann @ 2006-04-16 18:03 ` Tony Luck 2006-04-17 0:45 ` Steven Rostedt 2006-04-17 20:06 ` Ravikiran G Thirumalai 2 siblings, 0 replies; 31+ messages in thread From: Tony Luck @ 2006-04-16 18:03 UTC (permalink / raw) To: Arnd Bergmann Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, Paul Mackerras, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, rusty, Steven Rostedt, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On 4/16/06, Arnd Bergmann <arnd@arndb.de> wrote:
> #define PER_CPU_BASE 0xe000000000000000UL /* arch dependent */

On ia64 the percpu area is at 0xffffffffffff0000 so that it can be addressed without tying up another register (all percpu addresses are small negative offsets from "r0"). When David Mosberger chose this address he said that gcc 4 would actually make use of this, but I haven't checked the generated code to see whether it really is doing so. -Tony ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 15:34 ` Arnd Bergmann 2006-04-16 18:03 ` Tony Luck @ 2006-04-17 0:45 ` Steven Rostedt 2006-04-17 2:07 ` Arnd Bergmann 2006-04-17 20:06 ` Ravikiran G Thirumalai 2 siblings, 1 reply; 31+ messages in thread From: Steven Rostedt @ 2006-04-17 0:45 UTC (permalink / raw) To: Arnd Bergmann Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, Paul Mackerras, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, rusty, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On Sun, 2006-04-16 at 17:34 +0200, Arnd Bergmann wrote:
> On Sunday 16 April 2006 15:40, Steven Rostedt wrote:
> > I'll think more about this, but maybe someone else has some crazy ideas
> > that can find a solution to this that is both fast and robust.
>
> Ok, you asked for a crazy idea, you're going to get it ;-)
>
> You could take a fixed range from the vmalloc area (e.g. 1MB per cpu)
> and use that to remap pages on demand when you need per cpu data.
>
> #define PER_CPU_BASE 0xe000000000000000UL /* arch dependent */
> #define PER_CPU_STRIDE 0x100000UL
> #define __per_cpu_offset(__cpu) (PER_CPU_BASE + PER_CPU_STRIDE * (__cpu))
> #define per_cpu(var, cpu) (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset(cpu)))
> #define __get_cpu_var(var) per_cpu(var, smp_processor_id())
>
> This is a lot like what the current sparc64 implementation already does.

Hmm, interesting idea.

> The tricky part here is the remapping of pages. You'd need to
> alloc_pages_node() new pages whenever the already reserved space is
> not enough for the module you want to load and then map_vm_area()
> them into the space reserved for them.
>
> Advantages of this solution are:
> - no dependent load access for per_cpu()
> - might be flexible enough to implement a faster per_cpu_ptr()
> - can be combined with ia64-style per-cpu remapping
>
> Disadvantages are:
> - you can't use huge tlbs for mapping per cpu data like the
>   regular linear mapping -> may be slower on some archs
> - does not work in real mode, so percpu data can't be used
>   inside exception handlers on some architectures.

This is probably a big issue. I believe interrupt context in hrtimers uses per_cpu variables.

> - memory consumption is rather high when PAGE_SIZE is large

That's also something that I'm trying to solve. To use the least amount of memory and still have the performance.

Now, I've also thought about allocating per_cpu and when a module is loaded, reallocate more memory and copy it again. Use something like the kstopmachine to sync the system so that the CPUs don't update any per_cpu variables while this is happening, so that things can't get out of sync. This shouldn't be too much of an issue, since this would only be done when a module is being loaded, and that is a user event that doesn't happen often.

We would still need to use the method of keeping track of what is allocated and freed, so that when a module is unloaded, we can still free the area in the per_cpu data. And reallocate that area if a module is added that uses less or the same amount of memory as what was freed.

-- Steve

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-17 0:45 ` Steven Rostedt @ 2006-04-17 2:07 ` Arnd Bergmann 2006-04-17 2:17 ` Steven Rostedt 0 siblings, 1 reply; 31+ messages in thread From: Arnd Bergmann @ 2006-04-17 2:07 UTC (permalink / raw) To: Steven Rostedt Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, Paul Mackerras, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, rusty, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On Monday 17 April 2006 02:45, Steven Rostedt wrote:
> > - does not work in real mode, so percpu data can't be used
> >   inside exception handlers on some architectures.
>
> This is probably a big issue. I believe interrupt context in hrtimers
> uses per_cpu variables.

If it's just about hrtimers, it should be harmless, since they are run in softirq context. Even regular interrupt handlers are always called with paging enabled, otherwise you could not have them in modules.

> > - memory consumption is rather high when PAGE_SIZE is large
>
> That's also something that I'm trying to solve. To use the least amount
> of memory and still have the performance.
>
> Now, I've also thought about allocating per_cpu and when a module is
> loaded, reallocate more memory and copy it again. Use something like
> the kstopmachine to sync the system so that the CPUs don't update any
> per_cpu variables while this is happening, so that things can't get out
> of sync.

I guess this breaks if someone holds a pointer to a per-cpu variable while a module gets loaded.

Arnd <><

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-17 2:07 ` Arnd Bergmann @ 2006-04-17 2:17 ` Steven Rostedt 0 siblings, 0 replies; 31+ messages in thread From: Steven Rostedt @ 2006-04-17 2:17 UTC (permalink / raw) To: Arnd Bergmann Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, Paul Mackerras, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, rusty, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On Mon, 17 Apr 2006, Arnd Bergmann wrote:

> On Monday 17 April 2006 02:45, Steven Rostedt wrote:
> > > - does not work in real mode, so percpu data can't be used
> > >   inside exception handlers on some architectures.
> >
> > This is probably a big issue. I believe interrupt context in hrtimers
> > uses per_cpu variables.
>
> If it's just about hrtimers, it should be harmless, since they
> are run in softirq context. Even regular interrupt handlers are
> always called with paging enabled, otherwise you could not
> have them in modules.

Ah, you're right. You said exceptions, I'm thinking interrupts. I was a little confused why it wouldn't work.

> > > - memory consumption is rather high when PAGE_SIZE is large
> >
> > That's also something that I'm trying to solve. To use the least amount
> > of memory and still have the performance.
> >
> > Now, I've also thought about allocating per_cpu and when a module is
> > loaded, reallocate more memory and copy it again. Use something like
> > the kstopmachine to sync the system so that the CPUs don't update any
> > per_cpu variables while this is happening, so that things can't get out
> > of sync.
>
> I guess this breaks if someone holds a pointer to a per-cpu variable
> while a module gets loaded.

Argh, good point, I didn't think about that. Hmm, this solution is looking harder and harder.
Darn, I was really hoping this could be a little better in space savings and robustness. It's starting to seem clearer that Rusty's little hack may be the best solution. If that's the case, I can at least take comfort in knowing that the time I spent on this is documented in the LKML archives, and perhaps can keep others from spending the time too. That said, I haven't quite given up, and may spend a couple more sleepless nights pondering this. -- Steve ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 15:34 ` Arnd Bergmann 2006-04-16 18:03 ` Tony Luck 2006-04-17 0:45 ` Steven Rostedt @ 2006-04-17 20:06 ` Ravikiran G Thirumalai 2 siblings, 0 replies; 31+ messages in thread From: Ravikiran G Thirumalai @ 2006-04-17 20:06 UTC (permalink / raw) To: Arnd Bergmann Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Christoph Lameter, Joe Taylor, Andi Kleen, linuxppc-dev, Paul Mackerras, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, rusty, Steven Rostedt, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On Sun, Apr 16, 2006 at 05:34:18PM +0200, Arnd Bergmann wrote:
> On Sunday 16 April 2006 15:40, Steven Rostedt wrote:
> > I'll think more about this, but maybe someone else has some crazy ideas
> > that can find a solution to this that is both fast and robust.
>
> Ok, you asked for a crazy idea, you're going to get it ;-)
>
> You could take a fixed range from the vmalloc area (e.g. 1MB per cpu)
> and use that to remap pages on demand when you need per cpu data.
>
> #define PER_CPU_BASE 0xe000000000000000UL /* arch dependent */
> #define PER_CPU_STRIDE 0x100000UL
> #define __per_cpu_offset(__cpu) (PER_CPU_BASE + PER_CPU_STRIDE * (__cpu))
> #define per_cpu(var, cpu) (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset(cpu)))
> #define __get_cpu_var(var) per_cpu(var, smp_processor_id())
>
> This is a lot like what the current sparc64 implementation already does.
>
> The tricky part here is the remapping of pages. You'd need to
> alloc_pages_node() new pages whenever the already reserved space is
> not enough for the module you want to load and then map_vm_area()
> them into the space reserved for them.
>
> Advantages of this solution are:
> - no dependent load access for per_cpu()
> - might be flexible enough to implement a faster per_cpu_ptr()
> - can be combined with ia64-style per-cpu remapping

An implementation similar to the one you are mentioning was already proposed sometime back. http://lwn.net/Articles/119532/ The design was also meant to not restrict/limit per-cpu memory being allocated from modules. Maybe it was too early then, and maybe now is the right time, going by the interest in this thread :).

IMHO, a new solution should fix both static and dynamic per-cpu allocators:
- Avoid possibility of false sharing for dynamically allocated per-CPU data (with current alloc percpu)
- work early enough -- if alloc_percpu can work early enough (we can use that for counters like slab cachep stats which are currently racy; using atomic_t for them would be bad for performance)

An extra dereference in Steven's original proposal is bad (I had done some measurements earlier). My implementation had one less dereference compared to static per-cpu allocators, but the performance of both were the same, as the __per_cpu_offset table is always cache hot.

> Disadvantages are:
> - you can't use huge tlbs for mapping per cpu data like the
>   regular linear mapping -> may be slower on some archs

Yep, we waste a few tlb entries then, which is a bit of concern, but then we might be able to use hugetlbs for blocks of per-cpu data and minimize the impact.

Thanks, Kiran

^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 13:40 ` Steven Rostedt 2006-04-16 14:03 ` Sam Ravnborg 2006-04-16 15:34 ` Arnd Bergmann @ 2006-04-17 6:47 ` Rusty Russell 2006-04-17 11:33 ` Steven Rostedt 2 siblings, 1 reply; 31+ messages in thread From: Rusty Russell @ 2006-04-17 6:47 UTC (permalink / raw) To: Steven Rostedt Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, Paul Mackerras, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On Sun, 2006-04-16 at 09:40 -0400, Steven Rostedt wrote:
> The reason that this is done is that the per_cpu macro can't know
> whether or not the per_cpu variable was declared in a kernel or in a
> module. So the __per_cpu_offset[] array offset can't be used if the
> module allocation is in its own separate area. Remember that this offset
> array stores the difference from where the variable originally was and
> where it is now for each cpu.

Actually, the reason this is done is because the per_cpu_offset[] is designed to be replaced by a register or an expression on archs which care, and this is simple. The main problem is that so many archs want different things, it's a very UN task to build infrastructure.

I have always recommended using the same scheme to implement real dynamic per-cpu allocation (which would then replace the mini-allocator inside the module code). In fact, I had such an implementation which I reduced to the module case (dynamic per-cpu was too far-out at the time).

The arch would allocate a virtual memory hole for each CPU, and map pages as required (this is the simplest of several potential schemes). This gives the "same space between CPUs" property which is required for the ptr + per-cpu-offset scheme.
An arch would supply functions like:

/* Returns address of new memory chunk(s)
 * (add __per_cpu_offset to get virtual addresses). */
unsigned long alloc_percpu_memory(unsigned long *size);

/* Set by ia64 to reserve the first chunk for percpu vars
 * in modules only. */
#define __MODULE_RESERVE_FIRST_CHUNK

And an allocator would work on top of these.

I'm glad someone is looking at this again!

Rusty. -- ccontrol: http://ozlabs.org/~rusty/ccontrol ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-17 6:47 ` Rusty Russell @ 2006-04-17 11:33 ` Steven Rostedt 0 siblings, 0 replies; 31+ messages in thread From: Steven Rostedt @ 2006-04-17 11:33 UTC (permalink / raw) To: Rusty Russell Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, Paul Mackerras, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On Mon, 2006-04-17 at 16:47 +1000, Rusty Russell wrote:
>
> The arch would allocate a virtual memory hole for each CPU, and map
> pages as required (this is the simplest of several potential schemes).
> This gives the "same space between CPUs" property which is required for
> the ptr + per-cpu-offset scheme. An arch would supply functions like:
>
> /* Returns address of new memory chunk(s)
>  * (add __per_cpu_offset to get virtual addresses). */
> unsigned long alloc_percpu_memory(unsigned long *size);
>
> /* Set by ia64 to reserve the first chunk for percpu vars
>  * in modules only. */
> #define __MODULE_RESERVE_FIRST_CHUNK
>
> And an allocator would work on top of these.
>
> I'm glad someone is looking at this again!

Hi Rusty, thanks for the input. Arnd Bergmann also suggested doing the same thing. I've slept on this thought last night and I'm starting to like it more and more. At least it seems to be a better solution than some of the things that I've come up with. I'll start playing around a little and see what I can do with it. I also need to start doing some other work too, so this might take a month or two to get some results. So hopefully, I'll have another patch set in June or July that will be more acceptable. I'd like to thank all those that responded with ideas and criticisms. It's been very helpful. -- Steve ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 3:53 ` Steven Rostedt 2006-04-16 7:02 ` Paul Mackerras @ 2006-04-16 7:06 ` Nick Piggin 2006-04-16 16:06 ` Steven Rostedt 2006-04-17 17:10 ` Andi Kleen 1 sibling, 2 replies; 31+ messages in thread From: Nick Piggin @ 2006-04-16 7:06 UTC (permalink / raw) To: Steven Rostedt Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus, benedict.gaster, bjornw, Ingo Molnar, grundler, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

Steven Rostedt wrote:
> On Sun, 16 Apr 2006, Nick Piggin wrote:
>> Why is your module using so much per-cpu memory, anyway?
>
> Wasn't my module anyway. The problem appeared in the -rt patch set, when
> tracing was turned on. Some module was affected, and grew its per_cpu
> size by quite a bit. In fact we had to increase PERCPU_ENOUGH_ROOM by up
> to something like 300K.

Well that's easy then, just configure PERCPU_ENOUGH_ROOM to be larger when tracing is on in the -rt patchset? Or use alloc_percpu for the tracing data?

>> I don't think it would have been hard for the original author to make
>> it robust... just not both fast and robust. PERCPU_ENOUGH_ROOM seems
>> like an ugly hack at first glance, but I'm fairly sure it was a result
>> of design choices.
>
> Yeah, and I discovered the reasons for those choices as I worked on this.
> I've put a little more thought into this and still think there's a
> solution to not slow things down.
>
> Since the per_cpu_offset section is still smaller than the
> PERCPU_ENOUGH_ROOM and robust, I could still copy it into a per cpu memory
> field, and even add the __per_cpu_offset to it. This would still save
> quite a bit of space.
Well I don't think making it per-cpu would help much (presumably it is not going to be written to very frequently) -- I guess it would be a small advantage on NUMA. The main problem is the extra load in the fastpath. You can't start the next load until the results of the first come back.

> So now I'm asking for advice on some ideas that can be a work around to
> keep the robustness and speed.
>
> Is there a way (for archs that support it) to allocate memory in a per cpu
> manner. So each CPU would have its own variable table in the memory that
> is best for it. Then have a field (like the pda in x86_64) to point to
> this section, and use the linker offsets to index and find the per_cpu
> variables.
>
> So this solution still has one more redirection than the current solution
> (per_cpu_offset__##var -> __per_cpu_offset -> actual_var, whereas the
> current solution is __per_cpu_offset -> actual_var), but all the loads
> would be done from memory that would only be specified for a particular
> CPU.
>
> The generic case would still be the same as the patches I already sent,
> but the archs that can support it, can have something like the above.
>
> Would something like that be acceptable?

I still don't understand what the justification is for slowing down this critical bit of infrastructure for something that is only a problem in the -rt patchset, and even then only a problem when tracing is enabled.

-- SUSE Labs, Novell Inc. Send instant messages to your online friends http://au.messenger.yahoo.com ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 7:06 ` Nick Piggin @ 2006-04-16 16:06 ` Steven Rostedt 2006-04-17 17:10 ` Andi Kleen 1 sibling, 0 replies; 31+ messages in thread From: Steven Rostedt @ 2006-04-16 16:06 UTC (permalink / raw) To: Nick Piggin Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus, benedict.gaster, bjornw, Ingo Molnar, grundler, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

On Sun, 2006-04-16 at 17:06 +1000, Nick Piggin wrote:
> Steven Rostedt wrote:
> > On Sun, 16 Apr 2006, Nick Piggin wrote:
>
> >> Why is your module using so much per-cpu memory, anyway?
> >
> > Wasn't my module anyway. The problem appeared in the -rt patch set, when
> > tracing was turned on. Some module was affected, and grew its per_cpu
> > size by quite a bit. In fact we had to increase PERCPU_ENOUGH_ROOM by up
> > to something like 300K.
>
> Well that's easy then, just configure PERCPU_ENOUGH_ROOM to be larger
> when tracing is on in the -rt patchset? Or use alloc_percpu for the
> tracing data?

Yeah, we already know this. The -rt patch was what showed the problem, not the reason I was writing these patches.

> >> I don't think it would have been hard for the original author to make
> >> it robust... just not both fast and robust. PERCPU_ENOUGH_ROOM seems
> >> like an ugly hack at first glance, but I'm fairly sure it was a result
> >> of design choices.
> >
> > Yeah, and I discovered the reasons for those choices as I worked on this.
> > I've put a little more thought into this and still think there's a
> > solution to not slow things down.
> >
> > Since the per_cpu_offset section is still smaller than the
> > PERCPU_ENOUGH_ROOM and robust, I could still copy it into a per cpu memory
> > field, and even add the __per_cpu_offset to it.
> > This would still save quite a bit of space.
>
> Well I don't think making it per-cpu would help much (presumably it
> is not going to be written to very frequently) -- I guess it would
> be a small advantage on NUMA. The main problem is the extra load in
> the fastpath.
>
> You can't start the next load until the results of the first come
> back.

Yep, you're right here, and it bothers me too that this slows down performance.

> > So now I'm asking for advice on some ideas that can be a work around to
> > keep the robustness and speed.
> >
> > Is there a way (for archs that support it) to allocate memory in a per cpu
> > manner. So each CPU would have its own variable table in the memory that
> > is best for it. Then have a field (like the pda in x86_64) to point to
> > this section, and use the linker offsets to index and find the per_cpu
> > variables.
> >
> > So this solution still has one more redirection than the current solution
> > (per_cpu_offset__##var -> __per_cpu_offset -> actual_var, whereas the
> > current solution is __per_cpu_offset -> actual_var), but all the loads
> > would be done from memory that would only be specified for a particular
> > CPU.
> >
> > The generic case would still be the same as the patches I already sent,
> > but the archs that can support it, can have something like the above.
> >
> > Would something like that be acceptable?
>
> I still don't understand what the justification is for slowing down
> this critical bit of infrastructure for something that is only a
> problem in the -rt patchset, and even then only a problem when tracing
> is enabled.

It's because I'm anal retentive :-) I noticed that the current solution is somewhat a hack, and thought maybe it could be done cleaner. Perhaps I'm wrong and the hack _is_ the best solution, but it doesn't hurt to try to improve it. Or at the very least, to prove that the current solution is the way to go.
I'm not trying to solve an issue with the -rt patch and tracing, I'm just trying to make Linux a little more efficient in saving space. And you may be right that we can't do that without hurting performance, and thus we keep things as is. But I don't want to give up without a fight and miss something that can solve all this and keep Linux the best OS on the market! (not to say that it isn't even with the current solution) -- Steve ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-16 7:06 ` Nick Piggin 2006-04-16 16:06 ` Steven Rostedt @ 2006-04-17 17:10 ` Andi Kleen 1 sibling, 0 replies; 31+ messages in thread From: Andi Kleen @ 2006-04-17 17:10 UTC (permalink / raw) To: Nick Piggin Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, linuxppc-dev, paulus, benedict.gaster, bjornw, Ingo Molnar, grundler, Steven Rostedt, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux On Sunday 16 April 2006 09:06, Nick Piggin wrote: > I still don't understand what the justification is for slowing down > this critical bit of infrastructure for something that is only a > problem in the -rt patchset, and even then only a problem when tracing > is enabled. There are actually problems outside -rt. e.g. the Xen kernel was running into a near overflow and as more and more code is using per cpu variables others might too. I'm confident the problem can be solved without adding more variables though - e.g. in the way rusty proposed. -Andi ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-15 5:32 ` [PATCH 00/05] robust per_cpu allocation for modules Nick Piggin 2006-04-15 20:17 ` Steven Rostedt @ 2006-04-17 16:55 ` Christoph Lameter 2006-04-17 22:02 ` Ravikiran G Thirumalai 1 sibling, 1 reply; 31+ messages in thread From: Christoph Lameter @ 2006-04-17 16:55 UTC (permalink / raw) To: kiran Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, Steven Rostedt, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux On Sat, 15 Apr 2006, Nick Piggin wrote: > If I'm following you correctly, this adds another dependent load > to a per-CPU data access, and from memory that isn't node-affine. I am also concerned about that. Kiran has a patch to avoid allocpercpu having to go through one level of indirection that I guess would no longer work with this scheme. > If so, I think people with SMP and NUMA kernels would care more > about performance and scalability than the few k of memory this > saves. Right. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules 2006-04-17 16:55 ` Christoph Lameter @ 2006-04-17 22:02 ` Ravikiran G Thirumalai 2006-04-17 23:44 ` Steven Rostedt 0 siblings, 1 reply; 31+ messages in thread From: Ravikiran G Thirumalai @ 2006-04-17 22:02 UTC (permalink / raw) To: Christoph Lameter Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64, Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus, benedict.gaster, bjornw, Ingo Molnar, Nick Piggin, grundler, Steven Rostedt, starvik, Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux On Mon, Apr 17, 2006 at 09:55:02AM -0700, Christoph Lameter wrote: > On Sat, 15 Apr 2006, Nick Piggin wrote: > > > If I'm following you correctly, this adds another dependent load > > to a per-CPU data access, and from memory that isn't node-affine. > > I am also concerned about that. Kiran has a patch to avoid allocpercpu > having to go through one level of indirection that I guess would no > longer work with this scheme. The alloc_percpu reimplementation would work regardless of changes to static per-cpu areas. But, any extra indirection as was proposed initially is bad IMHO. > > > If so, I think people with SMP and NUMA kernels would care more > > about performance and scalability than the few k of memory this > > saves. > > Right. Me too :) Kiran ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/05] robust per_cpu allocation for modules

From: Steven Rostedt @ 2006-04-17 23:44 UTC
To: Ravikiran G Thirumalai
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
    Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus,
    benedict.gaster, bjornw, Ingo Molnar, Christoph Lameter,
    Nick Piggin, grundler, starvik, Linus Torvalds, Thomas Gleixner,
    rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal,
    schwidefsky, linux390, davem, parisc-linux

On Mon, 17 Apr 2006, Ravikiran G Thirumalai wrote:

> The alloc_percpu reimplementation would work regardless of changes to
> static per-cpu areas. But any extra indirection, as was proposed
> initially, is bad IMHO.

Don't worry, that idea has been shot down more than once ;-)

> > > If so, I think people with SMP and NUMA kernels would care more
> > > about performance and scalability than the few k of memory this
> > > saves.
> >
> > Right.
>
> Me too :)

Understood, but I'm going to start looking in the direction Rusty and
Arnd suggested, with the vmalloc approach. This would allow both saving
of memory and dynamic allocation of module memory, making it more
robust. And all this without that evil extra indirection!

So let's put my original patches where they belong, in the bit grave,
and continue on. I lived, I learned, and I've been shown the Way
(thanks to all, BTW). So now we can focus on a better solution.

Cheers,

-- Steve
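The vmalloc approach Steven refers to can be pictured with a small userspace sketch. The idea (names here, pcpu_init/pcpu_alloc_offset/pcpu_ptr, are hypothetical): reserve one region per cpu at a fixed stride, and hand out per-cpu objects as an offset that is valid in every cpu's region. Access is then pure arithmetic, with no per-object pointer table; the real proposal would reserve virtual address space and populate pages on demand, which is where the TLB caveat raised below comes in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS   4
#define AREA_SIZE 4096		/* stand-in for the reserved per-cpu area */

static char *pcpu_base;		/* NR_CPUS congruent areas, back to back */
static size_t pcpu_used;	/* trivial bump allocator into each area */

static int pcpu_init(void)
{
	pcpu_base = calloc(NR_CPUS, AREA_SIZE);
	return pcpu_base ? 0 : -1;
}

/* Reserve 'size' bytes; the returned offset is valid on every cpu. */
static size_t pcpu_alloc_offset(size_t size)
{
	size_t off = pcpu_used;

	pcpu_used += size;
	return off;
}

/* One add and a multiply; no dependent load through a pointer table. */
static void *pcpu_ptr(size_t off, int cpu)
{
	return pcpu_base + (size_t)cpu * AREA_SIZE + off;
}
```

Because a module's per-cpu objects get offsets in the same reserved region, loading and unloading modules becomes ordinary allocation within the region rather than a copy-and-relink of the whole static area.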
* Re: [PATCH 00/05] robust per_cpu allocation for modules

From: Christoph Lameter @ 2006-04-17 23:48 UTC
To: Steven Rostedt
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
    Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus,
    benedict.gaster, bjornw, Ingo Molnar, Ravikiran G Thirumalai,
    Nick Piggin, grundler, starvik, Linus Torvalds, Thomas Gleixner,
    rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal,
    schwidefsky, linux390, davem, parisc-linux

On Mon, 17 Apr 2006, Steven Rostedt wrote:

> So now we can focus on a better solution.

Could you have a look at Kiran's work?

Maybe one result of your work could be that the existing indirection
for alloc_percpu could be avoided?
* Re: [PATCH 00/05] robust per_cpu allocation for modules

From: Steven Rostedt @ 2006-04-18 1:51 UTC
To: Christoph Lameter
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
    Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus,
    benedict.gaster, bjornw, Ingo Molnar, Ravikiran G Thirumalai,
    Nick Piggin, grundler, starvik, Linus Torvalds, Thomas Gleixner,
    rth, Chris Zankel, tony.luck, LKML, ralf, Marc Gauthier, lethal,
    schwidefsky, linux390, davem, parisc-linux

On Mon, 2006-04-17 at 16:48 -0700, Christoph Lameter wrote:
> On Mon, 17 Apr 2006, Steven Rostedt wrote:
>
> > So now we can focus on a better solution.
>
> Could you have a look at Kiran's work?
>
> Maybe one result of your work could be that the existing indirection
> for alloc_percpu could be avoided?

Sure, I'll spend some time looking at what others have done and see
what I can put together. I'm also very busy with other stuff at the
moment, so this will be something I do more on the side. I don't think
there's a rush here, but as I stated in a previous post, I probably
won't have something out for a month or two.

-- Steve
* Re: [PATCH 00/05] robust per_cpu allocation for modules

From: Nick Piggin @ 2006-04-18 6:42 UTC
To: Steven Rostedt
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
    Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev, paulus,
    benedict.gaster, bjornw, Ingo Molnar, Ravikiran G Thirumalai,
    Christoph Lameter, grundler, starvik, Linus Torvalds,
    Thomas Gleixner, rth, Chris Zankel, tony.luck, LKML, ralf,
    Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

Steven Rostedt wrote:

> Understood, but I'm going to start looking in the direction Rusty and
> Arnd suggested, with the vmalloc approach. This would allow both
> saving of memory and dynamic allocation of module memory, making it
> more robust. And all this without that evil extra indirection!

Remember that this approach could effectively just move the indirection
to the TLB / page tables (well, I say "moves" because large kernel
mappings are effectively free compared with 4K mappings).

So be careful about coding up a large amount of work before unleashing
it: I doubt you'll be able to find a solution that doesn't involve
tradeoffs somewhere (but wohoo if you can).

-- SUSE Labs, Novell Inc.
* Re: [PATCH 00/05] robust per_cpu allocation for modules

From: Steven Rostedt @ 2006-04-18 12:47 UTC
To: Nick Piggin
Cc: Andrew Morton, linux-mips, linux-ia64, Martin Mares, spyro,
    Joe Taylor, Andi Kleen, linuxppc-dev, paulus, bjornw, Ingo Molnar,
    Ravikiran G Thirumalai, Christoph Lameter, grundler, starvik,
    Linus Torvalds, Thomas Gleixner, rth, Chris Zankel, tony.luck,
    LKML, ralf, Marc Gauthier, lethal, schwidefsky, linux390, davem,
    parisc-linux

[Removed from CC davidm@hpl.hp.com and benedict.gaster@superh.com
because I keep getting "unknown user" bounces from them]

On Tue, 2006-04-18 at 16:42 +1000, Nick Piggin wrote:
> Steven Rostedt wrote:
>
> > Understood, but I'm going to start looking in the direction Rusty
> > and Arnd suggested, with the vmalloc approach. This would allow
> > both saving of memory and dynamic allocation of module memory,
> > making it more robust. And all this without that evil extra
> > indirection!
>
> Remember that this approach could effectively just move the
> indirection to the TLB / page tables (well, I say "moves" because
> large kernel mappings are effectively free compared with 4K mappings).

Yeah, I thought about the paging latencies when it was first mentioned.
This is something whose impact will be very hard to know, because it
will be different on every system.

> So be careful about coding up a large amount of work before
> unleashing it: I doubt you'll be able to find a solution that doesn't
> involve tradeoffs somewhere (but wohoo if you can).

OK, but as I mentioned, this is now more of a side project, so a month
of work is not really going to be a month of work ;) I'll first try to
get something that just "works" and then post an RFC PATCH set to
gather more ideas, since obviously there are a lot of people out there
who know their systems much better than I do ;)

Thanks,

-- Steve
* Re: [PATCH 00/05] robust per_cpu allocation for modules

From: Paul Mackerras @ 2006-04-16 6:35 UTC
To: Steven Rostedt
Cc: Andrew Morton, linux-mips, David Mosberger-Tang, linux-ia64,
    Martin Mares, spyro, Joe Taylor, Andi Kleen, linuxppc-dev,
    benedict.gaster, bjornw, Ingo Molnar, grundler, starvik,
    Linus Torvalds, Thomas Gleixner, rth, chris, tony.luck, LKML, ralf,
    Marc Gauthier, lethal, schwidefsky, linux390, davem, parisc-linux

Steven Rostedt writes:

> The data in .data.percpu_offset is referenced by the per_cpu variable
> name, which points to the __per_cpu_offset array. For modules, it
> will point to the per_cpu_offset array of the module.
>
> Example:
>
>     DEFINE_PER_CPU(int, myint);
>
> would now create a variable called per_cpu_offset__myint in
> the .data.percpu_offset section. This variable will point to the (if
> defined in the kernel) __per_cpu_offset[] array. If this were a
> module variable, it would point to the module's per_cpu_offset[]
> array, which is created when the module is loaded.

This sounds like you have an extra memory reference each time a per-cpu
variable is accessed. Have you tried to measure the performance impact
of that? If so, how much performance does it lose?

Paul.
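Paul's question can be made concrete with a toy comparison of the two access patterns. This is a userspace sketch, not kernel code: the flat `storage` array, the macro names, and the offset setup are all invented for illustration; only the shape of the loads mirrors the schemes under discussion.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 2

static int storage[NR_CPUS];	/* plays the per-cpu copies of 'myint' */
static int myint;		/* stands in for the .data.percpu template */

static intptr_t __per_cpu_offset[NR_CPUS];

/* Proposed scheme: the variable's companion pointer to an offset array
 * (for a module variable it would point at the module's own array). */
static intptr_t *per_cpu_offset__myint = __per_cpu_offset;

/* Existing access: one load from __per_cpu_offset[cpu], then the data. */
#define per_cpu_existing(var, cpu) \
	(*(int *)((char *)&(var) + __per_cpu_offset[(cpu)]))

/* Proposed access: first load per_cpu_offset__##var (the extra memory
 * reference Paul asks about), then the offset, then the data. */
#define per_cpu_proposed(var, cpu) \
	(*(int *)((char *)&(var) + per_cpu_offset__##var[(cpu)]))
```

Both macros reach the same per-cpu slot; the proposed form simply pays one more dependent load on every access, which is what the performance question above is probing.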
end of thread, other threads: [~2006-04-18 12:48 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)

2006-04-14 21:18 [PATCH 00/05] robust per_cpu allocation for modules Steven Rostedt
2006-04-14 22:06 ` Andrew Morton
2006-04-14 22:12 ` Steven Rostedt
2006-04-14 22:12 ` Chen, Kenneth W
2006-04-15  3:10 ` [PATCH 00/08] robust per_cpu allocation for modules - V2 Steven Rostedt
2006-04-15  5:32 ` [PATCH 00/05] robust per_cpu allocation for modules Nick Piggin
2006-04-15 20:17 ` Steven Rostedt
2006-04-16  2:47 ` Nick Piggin
2006-04-16  3:53 ` Steven Rostedt
2006-04-16  7:02 ` Paul Mackerras
2006-04-16 13:40 ` Steven Rostedt
2006-04-16 14:03 ` Sam Ravnborg
2006-04-16 15:34 ` Arnd Bergmann
2006-04-16 18:03 ` Tony Luck
2006-04-17  0:45 ` Steven Rostedt
2006-04-17  2:07 ` Arnd Bergmann
2006-04-17  2:17 ` Steven Rostedt
2006-04-17 20:06 ` Ravikiran G Thirumalai
2006-04-17  6:47 ` Rusty Russell
2006-04-17 11:33 ` Steven Rostedt
2006-04-16  7:06 ` Nick Piggin
2006-04-16 16:06 ` Steven Rostedt
2006-04-17 17:10 ` Andi Kleen
2006-04-17 16:55 ` Christoph Lameter
2006-04-17 22:02 ` Ravikiran G Thirumalai
2006-04-17 23:44 ` Steven Rostedt
2006-04-17 23:48 ` Christoph Lameter
2006-04-18  1:51 ` Steven Rostedt
2006-04-18  6:42 ` Nick Piggin
2006-04-18 12:47 ` Steven Rostedt
2006-04-16  6:35 ` Paul Mackerras