From mboxrd@z Thu Jan 1 00:00:00 1970
From: Przemyslaw Marczak
Date: Mon, 16 Feb 2015 16:21:19 +0100
Subject: [U-Boot] [PATCH v2 2/8] arm: relocation: clear .bss section with arch memset if defined
In-Reply-To: <1424099601-14979-3-git-send-email-p.marczak@samsung.com>
References: <1424099601-14979-1-git-send-email-p.marczak@samsung.com> <1424099601-14979-3-git-send-email-p.marczak@samsung.com>
Message-ID: <54E20AEF.1080904@samsung.com>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: u-boot@lists.denx.de

Hello,

On 02/16/2015 04:13 PM, Przemyslaw Marczak wrote:
> For the ARM architecture, enabling CONFIG_USE_ARCH_MEMSET/MEMCPY
> greatly increases the memset/memcpy performance. This is possible
> thanks to the ARM multiple register instructions.
>
> Unfortunately the relocation is done without the cache enabled,
> so it takes some time, but zeroing the BSS memory takes much
> longer, especially for the configs with big static buffers.
>
> A quick test confirms that using the arch memcpy for relocation
> gives no significant boot time improvement.
> The same test confirms that enabling the arch memset for zeroing
> the BSS reduces the boot time.
>
> So this patch enables the arch memset for zeroing the BSS after
> the relocation process. For ARM boards, this can be enabled
> in board configs by defining: 'CONFIG_USE_ARCH_MEMSET'.
>
> This was tested on Trats2.
> A quick test with trace.
> Boot time from start to main_loop() entry:
> - ~1384ms - before this change
> - ~888ms - after this change
>
> Signed-off-by: Przemyslaw Marczak
> Cc: Albert Aribaud
> Cc: Tom Rini
> ---
>  arch/arm/lib/crt0.S | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/lib/crt0.S b/arch/arm/lib/crt0.S
> index 22df3e5..fab3d2c 100644
> --- a/arch/arm/lib/crt0.S
> +++ b/arch/arm/lib/crt0.S
> @@ -115,14 +115,22 @@ here:
>      bl    c_runtime_cpu_setup    /* we still call old routine here */
>
>      ldr   r0, =__bss_start       /* this is auto-relocated! */
> -    ldr   r1, =__bss_end         /* this is auto-relocated! */
>
> +#ifdef CONFIG_USE_ARCH_MEMSET
> +    ldr   r3, =__bss_end         /* this is auto-relocated! */
> +    mov   r1, #0x00000000        /* prepare zero to clear BSS */
> +
> +    subs  r2, r3, r0             /* r2 = memset len */
> +    bl    memset
> +#else
> +    ldr   r1, =__bss_end         /* this is auto-relocated! */
>      mov   r2, #0x00000000        /* prepare zero to clear BSS */
>
>  clbss_l:cmp   r0, r1             /* while not at end of BSS */
>      strlo r2, [r0]               /* clear 32-bit BSS word */
>      addlo r0, r0, #4             /* move to next */
>      blo   clbss_l
> +#endif
>
>      bl    coloured_LED_init
>      bl    red_led_on
>

This commit was left unchanged.

After a boot time test using an oscilloscope and the clock cycle
counter, I didn't notice a time difference of more than one ms.
In this case I think that inserting duplicated code here makes no sense.

Best regards,
-- 
Przemyslaw Marczak
Samsung R&D Institute Poland
Samsung Electronics
p.marczak at samsung.com
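
For readers following the quoted diff, here is a rough C sketch of the two
BSS-clearing paths it chooses between. It is only an illustration, not the
U-Boot code: the helper names (clear_words, clear_memset, both_paths_agree)
and the 16-word test buffer are assumptions made up for this example. The
first helper mirrors the clbss_l word loop, the second mirrors the
CONFIG_USE_ARCH_MEMSET branch, where the length is computed as end - start
and passed to memset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring the clbss_l loop:
 * store a zero word while start < end. */
static void clear_words(uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = 0;
}

/* Hypothetical helper mirroring the CONFIG_USE_ARCH_MEMSET branch:
 * r0 = start, r1 = 0, r2 = end - start, then call memset. */
static void clear_memset(uint32_t *start, const uint32_t *end)
{
    size_t len = (size_t)((const char *)end - (const char *)start);

    memset(start, 0, len);
}

/* Returns 1 when both helpers leave their buffers fully zeroed. */
static int both_paths_agree(void)
{
    uint32_t a[16], b[16];
    size_t i;

    /* Fill with a non-zero pattern so the clearing is observable. */
    for (i = 0; i < 16; i++)
        a[i] = b[i] = 0xdeadbeefu;

    clear_words(a, a + 16);
    clear_memset(b, b + 16);

    for (i = 0; i < 16; i++)
        if (a[i] != 0 || b[i] != 0)
            return 0;
    return 1;
}
```

Both helpers produce the same result on a word-aligned region; any speed
difference comes from how memset is implemented, which this sketch does not
try to measure.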