From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dirk Behme
Date: Sun, 19 Dec 2010 08:51:48 +0100
Subject: [U-Boot] [RFC PATCH v2] ARM: Avoid compiler optimization for usages of readb, writeb and friends.
In-Reply-To: <1292711230-3234-1-git-send-email-holler@ahsoftware.de>
References: <1292711230-3234-1-git-send-email-holler@ahsoftware.de>
Message-ID: <4D0DB994.1020703@googlemail.com>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

On 18.12.2010 23:27, Alexander Holler wrote:
> gcc 4.5.1 seems to ignore (at least some) volatile definitions,
> avoid that as done in the kernel.
>
> Reading C99 6.7.3 8 and the comment 114) there, I think it is a bug of that
> gcc version to ignore the volatile type qualifier used e.g. in __arch_getl().
> Anyway, using a definition as in the kernel headers avoids such optimizations when
> gcc 4.5.1 is used.
>
> Maybe the headers as used in the current linux-kernel should be used,
> but to avoid large changes, I've just added a small change to the current headers.
>
> Signed-off-by: Alexander Holler
> ---
>  arch/arm/include/asm/io.h |   20 ++++++++++++++------
>  1 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
> index ff1518e..5364b78 100644
> --- a/arch/arm/include/asm/io.h
> +++ b/arch/arm/include/asm/io.h
> @@ -125,13 +125,21 @@ extern inline void __raw_readsl(unsigned int addr, void *data, int longlen)
>  #define __raw_readw(a)          __arch_getw(a)
>  #define __raw_readl(a)          __arch_getl(a)
>
> -#define writeb(v,a)             __arch_putb(v,a)
> -#define writew(v,a)             __arch_putw(v,a)
> -#define writel(v,a)             __arch_putl(v,a)
> +/*
> + * TODO: The kernel offers some more advanced versions of barriers, it might
> + * have some advantages to use them instead of the simple one here.
> + */
> +#define dmb()           __asm__ __volatile__ ("" : : : "memory")
> +#define __iormb()       dmb()
> +#define __iowmb()       dmb()
> +
> +#define writeb(v,c)     ({ __iowmb(); __arch_putb(v,c); })
> +#define writew(v,c)     ({ __iowmb(); __arch_putw(v,c); })
> +#define writel(v,c)     ({ __iowmb(); __arch_putl(v,c); })
>
> -#define readb(a)        __arch_getb(a)
> -#define readw(a)        __arch_getw(a)
> -#define readl(a)        __arch_getl(a)
> +#define readb(c)        ({ u8  __v = __arch_getb(c); __iormb(); __v; })
> +#define readw(c)        ({ u16 __v = __arch_getw(c); __iormb(); __v; })
> +#define readl(c)        ({ u32 __v = __arch_getl(c); __iormb(); __v; })

Using the test code below [1] and then looking at the disassembly from
the two tool chains gcc version 4.3.3 (Sourcery G++ Lite 2009q1-203)
versus gcc version 4.5.1 (Sourcery G++ Lite 2010.09-50):

Yes, without the additional dmb(), gcc 4.5.1 just creates

00000000 <main>:
   0:   e3a00000        mov     r0, #0
   4:   e12fff1e        bx      lr

while with the additional dmb() it creates

00000000 <main>:
   0:   e59f300c        ldr     r3, [pc, #12]   ; 14
   4:   e5932028        ldr     r2, [r3, #40]   ; 0x28
   8:   e5930028        ldr     r0, [r3, #40]   ; 0x28
   c:   e0620000        rsb     r0, r2, r0
  10:   e12fff1e        bx      lr
  14:   48318000

which looks correct. And 4.3.3 generates the same code for both
readl() versions.

So:

Acked-by: Dirk Behme

Thanks

Dirk

[1]

arm-none-linux-gnueabi-gcc -Wall -O2 -c foo.c -o foo.o
arm-none-linux-gnueabi-objdump -D foo.o > foo.dis

-- foo.c --
struct gptimer {
        unsigned int tidr;      /* 0x00 r */
        unsigned char res[0xc];
        unsigned int tiocp_cfg; /* 0x10 rw */
        unsigned int tistat;    /* 0x14 r */
        unsigned int tisr;      /* 0x18 rw */
        unsigned int tier;      /* 0x1c rw */
        unsigned int twer;      /* 0x20 rw */
        unsigned int tclr;      /* 0x24 rw */
        unsigned int tcrr;      /* 0x28 rw */
        unsigned int tldr;      /* 0x2c rw */
        unsigned int ttgr;      /* 0x30 rw */
        unsigned int twpc;      /* 0x34 r */
        unsigned int tmar;      /* 0x38 rw */
        unsigned int tcar1;     /* 0x3c r */
        unsigned int tcicr;     /* 0x40 rw */
        unsigned int tcar2;     /* 0x44 r */
};

#define dmb()           __asm__ __volatile__ ("" : : : "memory")
#define __iormb()       dmb()

#define __arch_getl(a)  (*(volatile unsigned int *)(a))

#define readl(a)        __arch_getl(a)
//#define readl(c)      ({ unsigned int __v = __arch_getl(c); __iormb(); __v; })

int main(void) {

        struct gptimer *gpt1_base = (struct gptimer *)0x48318000;
        unsigned int cdiff, cstart, cend;

        cstart = readl(&gpt1_base->tcrr);
        cend = readl(&gpt1_base->tcrr);
        cdiff = cend - cstart;

        return cdiff;
}
-- foo.c --
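
For anyone who wants to try the barrier variant without an OMAP board at hand, here is a rough host-side sketch. It assumes gcc or clang (statement expressions and the empty asm with a "memory" clobber are GNU extensions) and a C11 compiler for static_assert. The helper name tcrr_delta() is made up for illustration and is not part of the patch. The static_assert checks that tcrr sits at offset 0x28, which is the 40 in the "ldr r2, [r3, #40]" lines of the disassembly above.

```c
#include <assert.h>
#include <stddef.h>

/* Same register layout as in foo.c above. */
struct gptimer {
        unsigned int tidr;      /* 0x00 r */
        unsigned char res[0xc];
        unsigned int tiocp_cfg; /* 0x10 rw */
        unsigned int tistat;    /* 0x14 r */
        unsigned int tisr;      /* 0x18 rw */
        unsigned int tier;      /* 0x1c rw */
        unsigned int twer;      /* 0x20 rw */
        unsigned int tclr;      /* 0x24 rw */
        unsigned int tcrr;      /* 0x28 rw */
        unsigned int tldr;      /* 0x2c rw */
        unsigned int ttgr;      /* 0x30 rw */
        unsigned int twpc;      /* 0x34 r */
        unsigned int tmar;      /* 0x38 rw */
        unsigned int tcar1;     /* 0x3c r */
        unsigned int tcicr;     /* 0x40 rw */
        unsigned int tcar2;     /* 0x44 r */
};

/* Compile-time check: tcrr is at 0x28 (decimal 40), as in the disassembly. */
static_assert(offsetof(struct gptimer, tcrr) == 0x28,
              "tcrr expected at offset 0x28");

/* Compiler-only barrier as in the patch; it emits no instruction. */
#define dmb()           __asm__ __volatile__ ("" : : : "memory")
#define __iormb()       dmb()

#define __arch_getl(a)  (*(volatile unsigned int *)(a))

/* Barrier variant of readl(): volatile load, then the compiler barrier. */
#define readl(c)        ({ unsigned int __v = __arch_getl(c); __iormb(); __v; })

/*
 * Hypothetical helper (not in the patch): read TCRR twice and return
 * the difference, like main() in foo.c does.
 */
unsigned int tcrr_delta(struct gptimer *gpt)
{
        unsigned int cstart = readl(&gpt->tcrr);
        unsigned int cend = readl(&gpt->tcrr);

        return cend - cstart;
}
```

Pointing tcrr_delta() at a struct gptimer in ordinary RAM gives a delta of 0, since nothing changes the value between the two loads. On real hardware the pointer would be the timer base address (0x48318000 in foo.c) and the delta would be the number of timer ticks between the two reads.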