* [RFC PATCH] ARM: vmlinux.lds.S: do not hardcode cacheline size as 32 bytes
From: Will Deacon @ 2011-12-13 18:06 UTC (permalink / raw)
To: linux-arm-kernel
The linker script assumes a cacheline size of 32 bytes when aligning
the .data..cacheline_aligned and .data..percpu sections.
This patch updates the script to use L1_CACHE_BYTES, which should be set
to 64 on platforms that require it.
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
I'm posting this as an RFC because, whilst this fixes a bug, it looks
like many platforms don't select ARM_L1_CACHE_SHIFT_6 when they should
(all Cortex-A8 platforms should select this, for example).
I'd be happy to select ARM_L1_CACHE_SHIFT_6 if CPU_V7, but this doesn't
help us for combined v6/v7 kernels...
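One possible shape for that Kconfig change (a sketch only, not part of the posted patch — symbol names follow arch/arm/mm/Kconfig, but the `default y if CPU_V7` line is the hypothetical part) would be:

```kconfig
# Hypothetical fragment: make v7 imply a 64-byte L1 line by default,
# while platforms can still select ARM_L1_CACHE_SHIFT_6 explicitly.
# Combined v6/v7 kernels remain the open question.
config ARM_L1_CACHE_SHIFT_6
	bool
	default y if CPU_V7

config ARM_L1_CACHE_SHIFT
	int
	default 6 if ARM_L1_CACHE_SHIFT_6
	default 5
```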
Answers on a postcard (although email is preferable),
Will
arch/arm/kernel/vmlinux.lds.S | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 20b3041..98067b7 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -4,6 +4,7 @@
*/
#include <asm-generic/vmlinux.lds.h>
+#include <asm/cache.h>
#include <asm/thread_info.h>
#include <asm/memory.h>
#include <asm/page.h>
@@ -174,7 +175,7 @@ SECTIONS
}
#endif
- PERCPU_SECTION(32)
+ PERCPU_SECTION(L1_CACHE_BYTES)
#ifdef CONFIG_XIP_KERNEL
__data_loc = ALIGN(4); /* location in binary */
@@ -205,7 +206,7 @@ SECTIONS
#endif
NOSAVE_DATA
- CACHELINE_ALIGNED_DATA(32)
+ CACHELINE_ALIGNED_DATA(L1_CACHE_BYTES)
READ_MOSTLY_DATA(32)
/*
--
1.7.4.1
* [RFC PATCH] ARM: vmlinux.lds.S: do not hardcode cacheline size as 32 bytes
From: Stephen Boyd @ 2011-12-15 19:00 UTC (permalink / raw)
To: linux-arm-kernel
On 12/13/11 10:06, Will Deacon wrote:
> The linker script assumes a cacheline size of 32 bytes when aligning
> the .data..cacheline_aligned and .data..percpu sections.
>
> This patch updates the script to use L1_CACHE_BYTES, which should be set
> to 64 on platforms that require it.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>
> I'm posting this as an RFC because, whilst this fixes a bug, it looks
> like many platforms don't select ARM_L1_CACHE_SHIFT_6 when they should
> (all Cortex-A8 platforms should select this, for example).
What are the implications of not having cache aligned data? Is it a
performance impact or something more?
> @@ -205,7 +206,7 @@ SECTIONS
> #endif
>
> NOSAVE_DATA
> - CACHELINE_ALIGNED_DATA(32)
> + CACHELINE_ALIGNED_DATA(L1_CACHE_BYTES)
> READ_MOSTLY_DATA(32)
Does READ_MOSTLY_DATA also need to be cache aligned? At least powerpc is
doing that.
--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
* [RFC PATCH] ARM: vmlinux.lds.S: do not hardcode cacheline size as 32 bytes
From: Will Deacon @ 2011-12-15 23:22 UTC (permalink / raw)
To: linux-arm-kernel
Hi Stephen,
On Thu, Dec 15, 2011 at 07:00:41PM +0000, Stephen Boyd wrote:
> On 12/13/11 10:06, Will Deacon wrote:
> > The linker script assumes a cacheline size of 32 bytes when aligning
> > the .data..cacheline_aligned and .data..percpu sections.
> >
> > This patch updates the script to use L1_CACHE_BYTES, which should be set
> > to 64 on platforms that require it.
> >
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > ---
> >
> > I'm posting this as an RFC because, whilst this fixes a bug, it looks
> > like many platforms don't select ARM_L1_CACHE_SHIFT_6 when they should
> > (all Cortex-A8 platforms should select this, for example).
>
> What are the implications of not having cache aligned data? Is it a
> performance impact or something more?
It's used for performance reasons, but also for correctness. For
example, I think that having the line size too small could cause
misalignment of streaming DMA buffers (see ARCH_DMA_MINALIGN), which could
lead to data corruption when we invalidate adjacent data on speculating
CPUs.
> > @@ -205,7 +206,7 @@ SECTIONS
> > #endif
> >
> > NOSAVE_DATA
> > - CACHELINE_ALIGNED_DATA(32)
> > + CACHELINE_ALIGNED_DATA(L1_CACHE_BYTES)
> > READ_MOSTLY_DATA(32)
>
> Does READ_MOSTLY_DATA also need to be cache aligned? At least powerpc is
> doing that.
This is in the optimisation camp, but it's probably a good idea since it will
help to keep read-mostly data cachelines in the shared state on SMP systems.
The main issue I have with all of this is the lack of platforms selecting
the correct shift. Defaulting to 6 would be my preference, but it's hard to
tell what to predicate this on (CPU_V6 || CPU_V7 ?).
Will