From mboxrd@z Thu Jan 1 00:00:00 1970
From: Allen Martin
Date: Mon, 9 Jul 2012 17:45:35 -0700
Subject: [U-Boot] [PATCH 2/7] HACK: rearrange link order for thumb
In-Reply-To: <20120707121536.4cdf7f35@lilith>
References: <1341598142-28873-1-git-send-email-amartin@nvidia.com>
 <1341598142-28873-3-git-send-email-amartin@nvidia.com>
 <4FF737F7.8040008@wwwdotorg.org>
 <20120706203329.GA29103@nvidia.com>
 <4FF74E30.3000101@wwwdotorg.org>
 <20120706231719.GE29103@nvidia.com>
 <20120707121536.4cdf7f35@lilith>
Message-ID: <20120710004535.GA2240@nvidia.com>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

On Sat, Jul 07, 2012 at 03:15:36AM -0700, Albert ARIBAUD wrote:
> Hi Allen,
> 
> On Fri, 6 Jul 2012 16:17:19 -0700, Allen Martin wrote:
> > On Fri, Jul 06, 2012 at 01:44:32PM -0700, Stephen Warren wrote:
> > > On 07/06/2012 02:33 PM, Allen Martin wrote:
> > > > On Fri, Jul 06, 2012 at 12:09:43PM -0700, Stephen Warren wrote:
> > > >> On 07/06/2012 12:08 PM, Allen Martin wrote:
> > > >>> Rearrange the link order of libraries to avoid out of bound
> > > >>> relocations in thumb mode. I have no idea how to fix this for real.
> > > >>
> > > >> Are the relocations branches or something else? It looks like
> > > >> unconditional jump range is +/-4MB for Thumb1 and +/-16MB for Thumb2, so
> > > >> I'm surprised we'd be exceeding that, considering the U-Boot binary is
> > > >> on the order of 256KB on Tegra right now.
> > > > 
> > > > This is the relocation type:
> > > > 
> > > > arch/arm/lib/libarm.o: In function `__flush_dcache_all':
> > > > /home/arm/u-boot/arch/arm/lib/cache.c:52: relocation truncated to fit: R_ARM_THM_JUMP11 against symbol `flush_cache' defined in .text section in arch/arm/cpu/armv7/libarmv7.o
> > > > 
> > > > The instruction is a "b.n" not a "b", which is what is causing the problem.
> > > > 
> > > > I think because of the weak alias the compiler used a short jump to
> > > > the local function, but when it got linked it resolved to a function
> > > > that was too far away for the short jump:
> > > > 
> > > > void flush_cache(unsigned long start, unsigned long size)
> > > > __attribute__((weak, alias("__flush_cache")));
> > > > 
> > > > 00000002 <__flush_dcache_all>:
> > > >    2: 2000      movs  r0, #0
> > > >    4: f04f 31ff mov.w r1, #4294967295 ; 0xffffffff
> > > >    8: e7fe      b.n   0 <__flush_cache>
> > > 
> > > Ah, that explanation makes sense.
> > > 
> > > > It looks like there's a "-fno-optimize-sibling-calls" option to gcc to
> > > > avoid this problem. Seems a shame to disable all short jumps for this
> > > > one case though.
> > > 
> > > It seems like a bug that the b-vs-b.n optimization is applied to a weak
> > > symbol, since the compiler can't possibly know the range of the jump.
> > > 
> > > Also, I've seen ld for some architectures rewrite the equivalent of b.n
> > > to plain b when needing to expand the branch target range; IIRC a
> > > process known as "relaxing"? Perhaps gcc is expecting ld to do that, but
> > > ld isn't?
> > 
> > And I forgot to mention, the code bloat from disabling the
> > optimization is about 400 bytes (185136 -> 185540), so it's not bad,
> > but it still seems a shame to disable all short branches because of
> > one misoptimized one.
> 
> Can this not be limited to compiling the object files which are known to be
> sensitive to the problem?

I understand this issue fairly well now. It's a known bug in the
assembler that has already been fixed:

http://sourceware.org/bugzilla/show_bug.cgi?id=12532

It only impacts preemptible symbols, and since u-boot doesn't have any
dynamically loadable objects, only explicitly defined weak symbols
should trigger the bug.
I built a new toolchain with binutils 2.22 and verified the bug is no
longer present there. -fno-optimize-sibling-calls is the correct
workaround for toolchains that do have the bug, so conditionally
disabling the optimization for binutils < 2.22 seems like the right fix.

I ran a quick scrub of the u-boot tree and there are 195 instances of
__attribute__((weak)) spread across 123 source files, so I think
disabling the optimization only on the failing object files would be too
fragile: code movement could cause new failures to crop up.

-Allen

-- 
nvpublic
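P.S. The conditional I have in mind would be along these lines in the
build system. This is a sketch only: the version probe and the
PLATFORM_CFLAGS hook are illustrative names, not actual U-Boot config.mk
code.

```make
# Probe the cross assembler's "major.minor" version from its banner line.
AS_VER_MAJOR := $(shell $(CROSS_COMPILE)as --version | \
	sed -n '1s/^.* \([0-9][0-9]*\)\.\([0-9][0-9]*\).*$$/\1/p')
AS_VER_MINOR := $(shell $(CROSS_COMPILE)as --version | \
	sed -n '1s/^.* \([0-9][0-9]*\)\.\([0-9][0-9]*\).*$$/\2/p')

# binutils earlier than 2.22 still carry sourceware bug 12532, so keep
# the sibling-call optimization disabled there.
OLD_BINUTILS := $(shell { [ $(AS_VER_MAJOR) -lt 2 ] || \
	{ [ $(AS_VER_MAJOR) -eq 2 ] && [ $(AS_VER_MINOR) -lt 22 ]; }; } && echo y)

ifeq ($(OLD_BINUTILS),y)
PLATFORM_CFLAGS += -fno-optimize-sibling-calls
endif
```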