From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:44632)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <lersek@redhat.com>) id 1Zw5aZ-0007qT-Rj
	for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:51:48 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <lersek@redhat.com>) id 1Zw5aX-0003nt-3O
	for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:51:43 -0500
Received: from mx1.redhat.com ([209.132.183.28]:48685)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <lersek@redhat.com>) id 1Zw5aW-0003np-Rr
	for qemu-devel@nongnu.org; Tue, 10 Nov 2015 04:51:41 -0500
References: <1447080991-24995-1-git-send-email-peter.maydell@linaro.org>
	<56411D73.1030603@redhat.com> <5641B185.4060206@redhat.com>
	<5641B84A.3070906@redhat.com>
From: Laszlo Ersek <lersek@redhat.com>
Message-ID: <5641BE29.6030403@redhat.com>
Date: Tue, 10 Nov 2015 10:51:37 +0100
MIME-Version: 1.0
In-Reply-To: <5641B84A.3070906@redhat.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH for-2.5] hw/timer/hpet.c: Avoid signed
 integer overflow which results in bugs on OSX
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Paolo Bonzini <pbonzini@redhat.com>, Peter Maydell <peter.maydell@linaro.org>, qemu-devel@nongnu.org
Cc: Aaron Elkins <threcius@yahoo.com>, patches@linaro.org, "Michael S. Tsirkin" <mst@redhat.com>

On 11/10/15 10:26, Paolo Bonzini wrote:
> 
> 
> On 10/11/2015 09:57, Laszlo Ersek wrote:
>> On 11/09/15 23:25, Laszlo Ersek wrote:
>>> On 11/09/15 15:56, Peter Maydell wrote:
>>>> Signed integer overflow in C is undefined behaviour, and the compiler
>>>> is at liberty to assume it can never happen and optimize accordingly.
>>>> In particular, the subtractions in hpet_time_after() and hpet_time_after64()
>>>> were causing OSX clang to optimize the code such that it was prone to
>>>> hangs and complaints about the main loop stalling (presumably because
>>>> we were spending all our time trying to service very high frequency
>>>> HPET timer callbacks). The clang sanitizer confirms the UB:
>>>>
>>>> hw/timer/hpet.c:119:26: runtime error: signed integer overflow: -2146967296 - 2147003978 cannot be represented in type 'int'
>>>>
>>>> Fix this by doing the subtraction as an unsigned operation and then
>>>> converting to signed for the comparison.
>>>>
>>>> Reported-by: Aaron Elkins <threcius@yahoo.com>
>>>> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>>>> ---
>>>>  hw/timer/hpet.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
>>>> index 3037bef..7f0391c 100644
>>>> --- a/hw/timer/hpet.c
>>>> +++ b/hw/timer/hpet.c
>>>> @@ -116,12 +116,12 @@ static uint32_t timer_enabled(HPETTimer *t)
>>>>  
>>>>  static uint32_t hpet_time_after(uint64_t a, uint64_t b)
>>>>  {
>>>> -    return ((int32_t)(b) - (int32_t)(a) < 0);
>>>> +    return ((int32_t)(b - a) < 0);
>>>>  }
>>>>  
>>>>  static uint32_t hpet_time_after64(uint64_t a, uint64_t b)
>>>>  {
>>>> -    return ((int64_t)(b) - (int64_t)(a) < 0);
>>>> +    return ((int64_t)(b - a) < 0);
>>>>  }
>>>>  
>>>>  static uint64_t ticks_to_ns(uint64_t value)
>>>>
>>>
>>> I'm late to the discussion, but I cannot imagine what would speak against:
>>>
>>>     return (b < a);
> 
> With uint32_t, b < a is wrong if b has just overflowed and a is just
> below 2^32.
> 
> With int32_t, b < a is wrong if b is just above 2^31 and a is just below
> 2^31.
> 
> Basically you want to consider a sliding window around (a+b)/2 (where
> a+b is computed with "infinite" precision), and see whether it's a or b
> that comes before the average.

Thanks!

(I guess / hope this is about the same that I managed to realize on my
own in my other email :))

> For int64_t/uint64_t it is indeed moot, because it takes centuries
> before you get close to 2^63 ticks (QEMU's emulated HPET has a 100 MHz
> frequency; one year is 86400*365.25*10^8 ticks, or about 2^51.5).

Finally! I resisted the urge to write "yet another hardware clock /
counter that overflows within a humanly observable interval, *groan*".
But, now that you say that the 64-bit HPET fixes (or may fix) that, I
don't have to hold back. :)

Thanks
Laszlo

> 
> Paolo
> 
>>> The post-patch code still converts a uint64_t difference to int32_t.
>>> According to the C standard(s), such a conversion (i.e., when the
>>> integer value being converted doesn't fit in the target signed integer)
>>> results in an implementation-defined value, or an implementation-defined
>>> signal is raised.
>>>
>>> On our platforms, the impl-def value is determined by "truncate to 32
>>> bits, then reinterpret the bit pattern as two's complement signed
>>> int32_t". Meaning, if:
>>>
>>>     (b > a) && ((b - a) & (1u << 31))
>>>
>>> (that is, "b" is so much larger than "a" that bit#31 is set in the (b-a)
>>> difference), then hpet_time_after() will now incorrectly return 1.
>>> (Because bit#31 will be interpreted as the sign bit, turned on.)
>>>
>>> Again, what speaks against
>>>
>>>     return (b < a);
>>>
>>> ?
>>>
>>> (The pre-patch code dates back to commit 16b29ae1 (year 2008), which
>>> offers precious little justification for the formula.)
>>
>> An hour or so after sending this email, I think I got an idea about the
>> code's intent. (Knowing practically nothing about HPET.) I guess the
>> HPET provides counters that can wrap around, so if you don't look
>> frequently enough, you won't know if the value is actually smaller or
>> greater (because you can't use raw magnitude to tell that).
>>
>> So I *guess* this code implemented the following idea: assume you have a
>> "last value", and a reading (?) from "just a bit later". You take the
>> neighborhood (with radius 2^31, or 2^63) of the "last value", and if the
>> new reading falls into the upper half of that neighborhood, you say "the
>> value has grown".
>>
>> This idea is actually very well suited for uintN_t modular arithmetic,
>> because the (x - y) difference expresses the number of times you have to
>> increment y to make it fall into the same remainder class as x, modulo 2^N.
>>
>> Hence, ((x - y) < 2^(N-1)) expresses "x is later than or equal to y"
>> (with both x and y being uintN_t variables). Equivalently, we have ((x -
>> y) >= 2^(N-1)) meaning "x is strictly earlier than y", which can also be
>> said as "y is strictly after x".
>>
>> And I think that's exactly what these functions implement:
>>
>> - Their names say "time after".
>>
>> - The condition
>>
>>   (x - y) >= 2^(N-1)
>>
>>   tests exactly whether the most significant bit is set in the
>>   difference.
>>
>>   When the bit pattern of the difference is reinterpreted as intN_t,
>>   that in turn means
>>
>>   (intN_t)(x - y) < 0
>>
>> So the functions seem to check if "a is strictly after b".
>>
>> ... The call sites seem to confirm this:
>>
>>         if (t->config & HPET_TN_32BIT) {
>>             while (hpet_time_after(cur_tick, t->cmp)) {
>>                 t->cmp = (uint32_t)(t->cmp + t->period);
>>             }
>>         } else {
>>             while (hpet_time_after64(cur_tick, t->cmp)) {
>>                 t->cmp += period;
>>             }
>>         }
>>
>> The loops increment "t->cmp" as long as "cur_tick is strictly after
>> t->cmp"; in other words, the loops make "t->cmp" catch up with "cur_tick".
>>
>> ... I think the functions are right after all, it's just that the
>> following would have matched my personal taste more:
>>
>>   b - a >= 1u << 31
>>
>> and
>>
>>   b - a >= 1ull << 63
>>
>> (Because they don't have any impl-def parts in them, plus to me they
>> make the intent, with the modular arithmetic and the "neighborhoods",
>> clearer.)
>>
>> I guess for others it's the opposite... :)
>>
>> Cheers
>> Laszlo
>>