* [RFC PATCH] ARM: vmlinux.lds.S: do not hardcode cacheline size as 32 bytes
From: Will Deacon @ 2011-12-13 18:06 UTC
  To: linux-arm-kernel

The linker script assumes a cacheline size of 32 bytes when aligning
the .data..cacheline_aligned and .data..percpu sections.

This patch updates the script to use L1_CACHE_BYTES, which should be set
to 64 on platforms that require it.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---

I'm posting this as an RFC because, whilst this fixes a bug, it looks
like many platforms don't select ARM_L1_CACHE_SHIFT_6 when they should
(all Cortex-A8 platforms should select this, for example).

I'd be happy to select ARM_L1_CACHE_SHIFT_6 if CPU_V7, but this doesn't
help us for combined v6/v7 kernels...
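
For reference, the value the linker script ends up using comes from
asm/cache.h, which (roughly, from memory) boils down to:

        #define L1_CACHE_SHIFT  CONFIG_ARM_L1_CACHE_SHIFT  /* 6 or 5 */
        #define L1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)      /* 64 or 32 */

so the alignment simply follows whatever CONFIG_ARM_L1_CACHE_SHIFT the
platform's Kconfig selection gives us.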

Answers on a postcard (although email is preferable),

Will

 arch/arm/kernel/vmlinux.lds.S |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 20b3041..98067b7 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -4,6 +4,7 @@
  */
 
 #include <asm-generic/vmlinux.lds.h>
+#include <asm/cache.h>
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/page.h>
@@ -174,7 +175,7 @@ SECTIONS
 	}
 #endif
 
-	PERCPU_SECTION(32)
+	PERCPU_SECTION(L1_CACHE_BYTES)
 
 #ifdef CONFIG_XIP_KERNEL
 	__data_loc = ALIGN(4);		/* location in binary */
@@ -205,7 +206,7 @@ SECTIONS
 #endif
 
 		NOSAVE_DATA
-		CACHELINE_ALIGNED_DATA(32)
+		CACHELINE_ALIGNED_DATA(L1_CACHE_BYTES)
 		READ_MOSTLY_DATA(32)
 
 		/*
-- 
1.7.4.1


* [RFC PATCH] ARM: vmlinux.lds.S: do not hardcode cacheline size as 32 bytes
From: Stephen Boyd @ 2011-12-15 19:00 UTC
  To: linux-arm-kernel

On 12/13/11 10:06, Will Deacon wrote:
> The linker script assumes a cacheline size of 32 bytes when aligning
> the .data..cacheline_aligned and .data..percpu sections.
>
> This patch updates the script to use L1_CACHE_BYTES, which should be set
> to 64 on platforms that require it.
>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>
> I'm posting this as an RFC because, whilst this fixes a bug, it looks
> like many platforms don't select ARM_L1_CACHE_SHIFT_6 when they should
> (all Cortex-A8 platforms should select this, for example).

What are the implications of not having cache-aligned data? Is it just a
performance impact, or something more?

> @@ -205,7 +206,7 @@ SECTIONS
>  #endif
>  
>  		NOSAVE_DATA
> -		CACHELINE_ALIGNED_DATA(32)
> +		CACHELINE_ALIGNED_DATA(L1_CACHE_BYTES)
>  		READ_MOSTLY_DATA(32)

Does READ_MOSTLY_DATA also need to be cache-aligned? At least powerpc is
doing that.

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


* [RFC PATCH] ARM: vmlinux.lds.S: do not hardcode cacheline size as 32 bytes
From: Will Deacon @ 2011-12-15 23:22 UTC
  To: linux-arm-kernel

Hi Stephen,

On Thu, Dec 15, 2011 at 07:00:41PM +0000, Stephen Boyd wrote:
> On 12/13/11 10:06, Will Deacon wrote:
> > The linker script assumes a cacheline size of 32 bytes when aligning
> > the .data..cacheline_aligned and .data..percpu sections.
> >
> > This patch updates the script to use L1_CACHE_BYTES, which should be set
> > to 64 on platforms that require it.
> >
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > ---
> >
> > I'm posting this as an RFC because, whilst this fixes a bug, it looks
> > like many platforms don't select ARM_L1_CACHE_SHIFT_6 when they should
> > (all Cortex-A8 platforms should select this, for example).
> 
> What are the implications of not having cache aligned data? Is it a
> performance impact or something more?

It's used for performance reasons, but also for correctness. For example,
I think that a line size which is too small could leave streaming DMA
buffers misaligned (see ARCH_DMA_MINALIGN), which could lead to data
corruption when we invalidate data adjacent to the buffer on CPUs that
speculatively prefetch.
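
For reference, asm/cache.h ties the DMA alignment to the same constant
(at least on my reading):

        #define ARCH_DMA_MINALIGN  L1_CACHE_BYTES

so with a shift of 5 on a 64-byte line part, a kmalloc'd DMA buffer can
end up sharing a real cacheline with unrelated data.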

> > @@ -205,7 +206,7 @@ SECTIONS
> >  #endif
> >  
> >  		NOSAVE_DATA
> > -		CACHELINE_ALIGNED_DATA(32)
> > +		CACHELINE_ALIGNED_DATA(L1_CACHE_BYTES)
> >  		READ_MOSTLY_DATA(32)
> 
> Does READ_MOSTLY_DATA also need to be cache aligned? At least powerpc is
> doing that.

This is in the optimisation camp, but it's probably a good idea since it
will help to keep read-mostly data cachelines in the shared state on SMP
systems.
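
i.e. something like the following on top of this patch (untested, just to
illustrate):

        -		READ_MOSTLY_DATA(32)
        +		READ_MOSTLY_DATA(L1_CACHE_BYTES)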

The main issue I have with all of this is the lack of platforms selecting
the correct shift. Defaulting to 6 would be my preference, but it's hard to
tell what to predicate this on (CPU_V6 || CPU_V7 ?).
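
One (untested) way to write the default down, assuming the existing
ARM_L1_CACHE_SHIFT_6 entry in arch/arm/mm/Kconfig, would be:

        config ARM_L1_CACHE_SHIFT_6
        	bool
        	default y if CPU_V7

but whether CPU_V6 also needs to be in that list is the open question.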

Will

