From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44632) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zw5aZ-0007qT-Rj for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:51:48 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Zw5aX-0003nt-3O for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:51:43 -0500 Received: from mx1.redhat.com ([209.132.183.28]:48685) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zw5aW-0003np-Rr for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:51:41 -0500 References: <1447080991-24995-1-git-send-email-peter.maydell@linaro.org> <56411D73.1030603@redhat.com> <5641B185.4060206@redhat.com> <5641B84A.3070906@redhat.com> From: Laszlo Ersek Message-ID: <5641BE29.6030403@redhat.com> Date: Tue, 10 Nov 2015 10:51:37 +0100 MIME-Version: 1.0 In-Reply-To: <5641B84A.3070906@redhat.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH for-2.5] hw/timer/hpet.c: Avoid signed integer overflow which results in bugs on OSX List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Paolo Bonzini , Peter Maydell , qemu-devel@nongnu.org Cc: Aaron Elkins , patches@linaro.org, "Michael S. Tsirkin" On 11/10/15 10:26, Paolo Bonzini wrote: > > > On 10/11/2015 09:57, Laszlo Ersek wrote: >> On 11/09/15 23:25, Laszlo Ersek wrote: >>> On 11/09/15 15:56, Peter Maydell wrote: >>>> Signed integer overflow in C is undefined behaviour, and the compiler >>>> is at liberty to assume it can never happen and optimize accordingly. >>>> In particular, the subtractions in hpet_time_after() and hpet_time_after64() >>>> were causing OSX clang to optimize the code such that it was prone to >>>> hangs and complaints about the main loop stalling (presumably because >>>> we were spending all our time trying to service very high frequency >>>> HPET timer callbacks). The clang sanitizer confirms the UB: >>>> >>>> hw/timer/hpet.c:119:26: runtime error: signed integer overflow: -2146967296 - 2147003978 cannot be represented in type 'int' >>>> >>>> Fix this by doing the subtraction as an unsigned operation and then >>>> converting to signed for the comparison. >>>> >>>> Reported-by: Aaron Elkins >>>> Signed-off-by: Peter Maydell >>>> --- >>>> hw/timer/hpet.c | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c >>>> index 3037bef..7f0391c 100644 >>>> --- a/hw/timer/hpet.c >>>> +++ b/hw/timer/hpet.c >>>> @@ -116,12 +116,12 @@ static uint32_t timer_enabled(HPETTimer *t) >>>> >>>> static uint32_t hpet_time_after(uint64_t a, uint64_t b) >>>> { >>>> - return ((int32_t)(b) - (int32_t)(a) < 0); >>>> + return ((int32_t)(b - a) < 0); >>>> } >>>> >>>> static uint32_t hpet_time_after64(uint64_t a, uint64_t b) >>>> { >>>> - return ((int64_t)(b) - (int64_t)(a) < 0); >>>> + return ((int64_t)(b - a) < 0); >>>> } >>>> >>>> static uint64_t ticks_to_ns(uint64_t value) >>>> >>> >>> I'm late to the discussion, but I cannot imagine what would speak against: >>> >>> return (b < a); > > With uint32_t, b < a is wrong if b has just overflowed and a is just > below 2^32. > > With int32_t, b < a is wrong if b is just above 2^31 and a is just below > 2^31. > > Basically you want to consider a sliding window around (a+b)/2 (where > a+b is computed with "infinite" precision), and see whether it's a or b > that comes before the average. Thanks! (I guess / hope this is about the same that I managed to realize on my own in my other email :)) > For int64_t/uint64_t it is indeed moot, because it takes centuries > before you get close to 2^63 ticks (QEMU's emulated HPET has a 100 MHz > frequency; one year is 86400*365.25*10^8 ticks, or about 2^51.5). Finally! I resisted the urge to write "yet another hardware clock / counter that overflows within a humanly observable interval, *groan*". But, now that you say that the 64-bit HPET fixes (or may fix) that, I don't have to hold back. :) Thanks Laszlo > > Paolo > >>> The post-patch code still converts a uint64_t difference to int32_t. >>> According to the C standard(s), such a conversion (i.e., when the >>> integer value being converted doesn't fit in the target signed integer) >>> results in an implementation-defined value, or an implementation-defined >>> signal is raised. >>> >>> On our platforms, the impl-def value is determined by "truncate to 32 >>> bits, then reinterpret the bit pattern as two's complement signed >>> int32_t". Meaning, if: >>> >>> (b > a) && ((b - a) & (1u << 31)) >>> >>> (that is, "b" is so much larger than "a" that bit#31 is set in the (b-a) >>> difference), then hpet_time_after() will now incorrectly return 1. >>> (Because bit#31 will be interpreted as the sign bit, turned on.) >>> >>> Again, what speaks against >>> >>> return (b < a); >>> >>> ? >>> >>> (The pre-patch code dates back to commit 16b29ae1 (year 2008), which >>> offers precious little justification for the formula.) >> >> An hour or so after sending this email, I think I got an idea about the >> code's intent. (Knowing practically nothing about HPET.) I guess the >> HPET provides counters that can wrap around, so if you don't look >> frequently enough, you won't know if the value is actually smaller or >> greater (because you can't use raw magnitude to tell that). >> >> So I *guess* this code implemented the following idea: assume you have a >> "last value", and a reading (?) from "just a bit later". You take the >> neighborhood (with radius 2^31, or 2^63) of the "last value", and if the >> new reading falls into the upper half of that neighborhood, you say "the >> value has grown". >> >> This idea is actually very well suited for uintN_t modular arithmetic, >> because the (x - y) difference expresses the number of times you have to >> increment y to make it fall into the same remainder class as x, modulo 2^N. >> >> Hence, ((x - y) < 2^(N-1)) expresses "x is later than or equal to y" >> (with both x and y being uintN_t variables). Equivalently, we have ((x - >> y) >= 2^(N-1)) meaning "x is strictly earlier than y", which can also be >> said as "y is strictly after x". >> >> And I think that's exactly what these functions implement: >> >> - Their names say "time after". >> >> - The condition >> >> (x - y) >= 2^(N-1) >> >> tests exactly whether the most significant bit is set in the >> difference. >> >> When the bit pattern of the difference is reinterpreted as intN_t, >> that in turn means >> >> (intN_t)(x - y) < 0 >> >> So the functions seem to check if "a is strictly after b". >> >> ... The call sites seem to confirm this: >> >> if (t->config & HPET_TN_32BIT) { >> while (hpet_time_after(cur_tick, t->cmp)) { >> t->cmp = (uint32_t)(t->cmp + t->period); >> } >> } else { >> while (hpet_time_after64(cur_tick, t->cmp)) { >> t->cmp += period; >> } >> } >> >> The loops increment "t->cmp" as long as "cur_tick is strictly after >> t->cmp"; in other words, the loops make "t->cmp" catch up with "cur_tick". >> >> ... I think the functions are right after all, it's just that the >> following would have matched my personal taste more: >> >> b - a >= 1u << 31 >> >> and >> >> b - a >= 1ull << 63 >> >> (Because they don't have any impl-def parts in them, plus to me they >> make the intent, with the modular arithmetic and the "neighborhoods", >> clearer.) >> >> I guess for others it's the opposite... :) >> >> Cheers >> Laszlo >>