From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Graf Date: Wed, 26 Oct 2016 09:14:16 +0200 Subject: [U-Boot] [RFC PATCH] lib/timer: initialize timebase_l/timebase_h In-Reply-To: References: <1477356712-7015-1-git-send-email-andre.przywara@arm.com> <0562374b-040f-cdb5-b1f7-6c718ba777e1@suse.de> Message-ID: <581057C8.7010203@suse.de> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de On 10/26/2016 02:07 AM, Andr? Przywara wrote: > On 25/10/16 08:52, Alexander Graf wrote: > > Hi Alex, > > thanks for looking at this! > >> On 25/10/2016 02:51, Andre Przywara wrote: >>> On systems using the generic timer routines defined in lib/time.c we >>> use timebase_l and timebase_h fields from the gd to detect wraparounds >>> in our tick counter. The tick calculcation algorithm silently assumes >>> that a long is only 32 bits, which leads to wrong results when timebase_h >>> is not 0 on 64-bit systems. >>> As one possible fix lets initialize timebase_h (and timebase_l) to 0, so >>> on 64-bit systems timebase_h will never(TM) be bigger than 0 and thus >>> cannot spoil timebase_l by being ORed into it. >>> >>> This fixes occasional timeout issues on the Pine64 (and possibly other >>> ARM64 systems). > Well, not really (this fix isn't sufficient), but read on ... > >>> Signed-off-by: Andre Przywara >>> --- >>> Hi, >>> >>> I am bit puzzled what the proper fix is, this one is the easiest I could >>> come up with. Not sure if the gd should be zeroed normally (and it's just >>> broken on sunxi/arm64 because of some linker issues) or whether we really >>> forgot to initialize those fields and just got away with it. >> The gd clearing happens via crt0_64.S -> board_init_f_init_reserve(). >> There we should have fully cleared all memory associated with global data. > Ah,thanks for pointing to that. I was a bit clueless where to start > looking for it - "git grep gd" is obviously not a good idea ;-) > >> I can't see anything obviously wrong in that code. Could you try to dump >> gd if the timer offsets are != 0 on init? Maybe we can conclude >> something from the data in it. > So I agree that this code looks sane and indeed in all my dumps it looks > like the initialization works fine. > I did some more debugging and learned that the increased timebase_h > comes from detected overflows: in fact some readings are really lower > than previous ones: > ... > MMC: SUNXI SD/MMC: 0 > get_ticks() overflow: now: 118046720, tbl: 118063103, tbh: 0 > get_ticks() overflow: now: 118439936, tbl: 118456318, tbh: 1 > get_ticks() overflow: now: 118571008, tbl: 118587391, tbh: 2 > get_ticks() overflow: now: 118734848, tbl: 118751230, tbh: 3 > get_ticks() overflow: now: 119422976, tbl: 119439358, tbh: 4 > get_ticks() overflow: now: 119783424, tbl: 119799807, tbh: 5 > get_ticks() overflow: now: 120045568, tbl: 120061950, tbh: 6 > get_ticks() overflow: now: 120406016, tbl: 120422398, tbh: 7 > *** Warning - bad CRC, using default environment > ...... > Not sure how this actually happens - I am not aware of any such severe > hardware errata in the A53 r0p4 or the timer, also I think that would > have bitten us elsewhere already. > As ATF keeps the secondaries in WFI, U-Boot only runs on CPU0 (confirmed > by MPIDR reads). > Also U-Boot reads the physical counter, so a bogus CNTVOFF can also not > be the culprit. > > So I can fix this annoying issue by using one of the other proposed > fixes (handling timebase_h only if BITS_PER_LONG < 64 or defining > get_ticks in armv8/generic_timer.c), but it would still be interesting > to find the real root cause. Can you try and ask around? If this bites us in U-Boot, there's a good chance Linux timers should be broken too, no? I remember that NXP had similar problems with the timer: https://patchwork.kernel.org/patch/9344727/ Alex