From mboxrd@z Thu Jan 1 00:00:00 1970 From: linux@arm.linux.org.uk (Russell King - ARM Linux) Date: Mon, 10 Jan 2011 10:51:20 +0000 Subject: [PATCH] ARM: sched_clock: improve mult/shift accuracy with high frequency clocks In-Reply-To: References: <20101221105149.GO28157@n2100.arm.linux.org.uk> <20110103003718.GF17727@n2100.arm.linux.org.uk> <1294084049.2571.30.camel@work-vm> <20110109105200.GC31708@n2100.arm.linux.org.uk> Message-ID: <20110110105120.GA12552@n2100.arm.linux.org.uk> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On Sun, Jan 09, 2011 at 10:55:04PM -0500, Nicolas Pitre wrote: > However this begs the question about the actual meaning of the value for > the minsec argument to clocks_calc_mult_shift() (which IMHO should be > renamed to maxsec instead). In the ARM sched_clock code the value of 60 > is totally arbitrary and may happen to be good enough, but a value of 0 > would also be totally arbitrary and also work fine. But at least a 0 > value wouldn't imply any false meaning. And in the case of the > sched_clock support code, we know the value we need: 90% > of the actual hardware clock period, so using that would at least make > the code self consistent even if in practice this doesn't change the > final results. Actually, minsec is utterly wrong. minsec is there to clamp the conversion from the N-bit cyclecounter to a 64-bit nanosecond value to ensure that there isn't a 64-bit overflow within the 'minsec' period. With a 32-bit or smaller cyclecounter, as 32-bit x 32-bit can never overflow a 64-bit destination, so if anything zero should be passed in this case. If larger than 32-bit, then a value may be needed to clamp it. However, wrap = (1 << bits) / freq ns = (mult * cnt) >> shift mult = (NSEC_PER_SEC (a 30-bit number) << 32) / freq (32-bit max) A 33-bit counter would need a 32-bit multiplier to wrap-around within the 64-bit maths. The frequency which produces a 32-bit multiplier is 2GHz, which gives a wrap period of 4.29s. Above this frequency, the 64-bit math can't overflow as the multiplier becomes smaller. Below this frequency, counter wrap periods get longer and the multiplier becomes larger up to 32 bits - and this is where the 64-bit math problem starts. A 33-bit counter with a 1.8GHz clock gives a multiplier of 2386092942 (0x8E38E38E). Such a multiplier wraps 64-bit maths@7730941133 and it takes the counter 4.29s to get there. 1 << bits / freq gives a counter wrap period of 4.77s, which is over-estimating the 64-bit math wrap. I question whether using 1 << bits / freq is valid for minsec - does there exist a frequency where an integer minsec is larger than required. Luckily the maths is safe as it'll produce a smaller mult. However, I question whether using 1 << bits / freq is any arbitary than a 60 or 0 value - it's certainly mathematically the wrong wrap period. So, in summary I'd suggest using a value of 0 for sched_clock() if we're going to change it - we don't accept more than 32-bits from the counter at present, so the whole minsec thing really isn't needed to prevent wrap. If we ever allow more than 32-bits then yes it will.