From mboxrd@z Thu Jan 1 00:00:00 1970
From: arnd@arndb.de (Arnd Bergmann)
Date: Fri, 05 Dec 2014 11:08:07 +0100
Subject: [PATCH] optimize ktime_divns for constant divisors
In-Reply-To:
References:
Message-ID: <2145860.PBxl6kLNRF@wuerfel>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Thursday 04 December 2014 23:30:08 Nicolas Pitre wrote:
> > res += (u64)x_lo * y_hi + (u64)x_hi * y_lo;
>
> That, too, risk overflowing.
>
> Let's say x_lo = 0xffffffff and x_hi = 0xffffffff.  You get:
>
>       0xffffffff * 0x83126e97 ->  0x83126e967ced9169
>       0xffffffff * 0x8d4fdf3b ->  0x8d4fdf3a72b020c5
>                                  -------------------
>                                  0x110624dd0ef9db22e
>
> Therefore the sum doesn't fit into a u64 variable.
>
> It is possible to skip carry handling but only when the MSB of both
> constants are zero.  Here it is not the case.

If I understand this right, there are two possible optimizations to
avoid the overflow:

- for anything using monotonic time, or elapsed time, we can guarantee
  that the upper bits are zero. Relying on monotonic time is a bit
  dangerous, because that would mean introducing an API that works
  with ktime_get() but not ktime_get_real(), and risk introducing
  subtle bugs. However, ktime_us_delta() could be optimized, and we
  can introduce similar ktime_sec_delta() and ktime_ms_delta()
  functions with the same properties.

- one could always pre-shift the ktime_t value. For a division by
  1000, we can shift right by 3 bits first, then do the multiplication
  and then do another shift. Not sure if that helps at all or if the
  extra shift operation makes this counterproductive.

	Arnd
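
For illustration, the overflow and the pre-shift idea discussed above
can be checked with a small userspace sketch. This is not kernel code:
cross_sum_overflows(), div1000_preshift() and check_examples() are
hypothetical helpers, the two constant halves are simply the ones from
the quoted example, and plain '/' stands in for the multiply-and-shift
step that the real optimization would use.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Halves of the reciprocal constant, copied from the quoted example.
 * Treating them as y_hi and y_lo of one 64-bit constant is an
 * assumption of this sketch.
 */
#define Y_HI 0x83126e97u
#define Y_LO 0x8d4fdf3bu

/*
 * Return 1 if x_lo * y_hi + x_hi * y_lo wraps around in 64 bits,
 * 0 if the sum fits. Hypothetical helper, not a kernel function.
 */
int cross_sum_overflows(uint64_t x)
{
	uint32_t x_lo = (uint32_t)x;
	uint32_t x_hi = (uint32_t)(x >> 32);
	uint64_t a = (uint64_t)x_lo * Y_HI;
	uint64_t b = (uint64_t)x_hi * Y_LO;

	/* an unsigned sum wrapped if it ends up below one operand */
	return (uint64_t)(a + b) < a;
}

/*
 * Pre-shift variant of a division by 1000: dropping three bits first
 * leaves a division by 125 on a value below 2^61.
 */
uint64_t div1000_preshift(uint64_t ns)
{
	return (ns >> 3) / 125;
}

void check_examples(void)
{
	/* worst case from the quoted example: the sum needs 65 bits */
	assert(cross_sum_overflows(UINT64_MAX) == 1);
	/* top three bits clear: the sum fits for this constant */
	assert(cross_sum_overflows(UINT64_MAX >> 3) == 0);
	/* the pre-shift does not change the quotient */
	assert(div1000_preshift(UINT64_MAX) == UINT64_MAX / 1000);
	assert(div1000_preshift(999) == 0);
	assert(div1000_preshift(1000) == 1);
}
```

Calling check_examples() exercises both cases. The sketch only shows
that the cross-term sum fits once the top three bits are clear; it does
not measure whether the extra shift is a net win.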