All of lore.kernel.org
 help / color / mirror / Atom feed
From: Markus Armbruster <armbru@redhat.com>
To: Christophe de Dinechin <dinechin@redhat.com>
Cc: Tao Xu <tao3.xu@intel.com>,
	mdroth@linux.vnet.ibm.com, ehabkost@redhat.com,
	qemu-devel@nongnu.org
Subject: Re: [PATCH] util/cutils: Expand do_strtosz parsing precision to 64 bits
Date: Tue, 17 Dec 2019 16:01:05 +0100	[thread overview]
Message-ID: <87o8w7ate6.fsf@dusky.pond.sub.org> (raw)
In-Reply-To: <50B6643C-78A7-4724-8A0B-3D51B1898FFE@redhat.com> (Christophe de Dinechin's message of "Tue, 17 Dec 2019 15:12:52 +0100")

Christophe de Dinechin <dinechin@redhat.com> writes:

>> On 17 Dec 2019, at 15:08, Markus Armbruster <armbru@redhat.com> wrote:
>> 
>> Christophe de Dinechin <dinechin@redhat.com> writes:
>> 
>>>> On 5 Dec 2019, at 16:29, Markus Armbruster <armbru@redhat.com> wrote:
>>>> 
>>>> Tao Xu <tao3.xu@intel.com> writes:
>>>> 
>>>>> Parse input string both as a double and as a uint64_t, then use the
>>>>> method which consumes more characters. Update the related test cases.
>>>>> 
>>>>> Signed-off-by: Tao Xu <tao3.xu@intel.com>
>>>>> ---
>>>> [...]
>>>>> diff --git a/util/cutils.c b/util/cutils.c
>>>>> index 77acadc70a..b08058c57c 100644
>>>>> --- a/util/cutils.c
>>>>> +++ b/util/cutils.c
>>>>> @@ -212,24 +212,43 @@ static int do_strtosz(const char *nptr, const char **end,
>>>>>                      const char default_suffix, int64_t unit,
>>>>>                      uint64_t *result)
>>>>> {
>>>>> -    int retval;
>>>>> -    const char *endptr;
>>>>> +    int retval, retd, retu;
>>>>> +    const char *suffix, *suffixd, *suffixu;
>>>>>    unsigned char c;
>>>>>    int mul_required = 0;
>>>>> -    double val, mul, integral, fraction;
>>>>> +    bool use_strtod;
>>>>> +    uint64_t valu;
>>>>> +    double vald, mul, integral, fraction;
>>>> 
>>>> Note for later: @mul is double.
>>>> 
>>>>> +
>>>>> +    retd = qemu_strtod_finite(nptr, &suffixd, &vald);
>>>>> +    retu = qemu_strtou64(nptr, &suffixu, 0, &valu);
>>>>> +    use_strtod = strlen(suffixd) < strlen(suffixu);
>>>>> +
>>>>> +    /*
>>>>> +     * Parse @nptr both as a double and as a uint64_t, then use the method
>>>>> +     * which consumes more characters.
>>>>> +     */
>>>> 
>>>> The comment is in a funny place.  I'd put it right before the
>>>> qemu_strtod_finite() line.
>>>> 
>>>>> +    if (use_strtod) {
>>>>> +        suffix = suffixd;
>>>>> +        retval = retd;
>>>>> +    } else {
>>>>> +        suffix = suffixu;
>>>>> +        retval = retu;
>>>>> +    }
>>>>> 
>>>>> -    retval = qemu_strtod_finite(nptr, &endptr, &val);
>>>>>    if (retval) {
>>>>>        goto out;
>>>>>    }
>>>> 
>>>> This is even more subtle than it looks.
>>> 
>>> But why it is even necessary?
>>> 
>>> The “contract” for the function used to be that it returned rounded values
>>> beyond 2^53, which in itself is curious.
>>> 
>>> But now it’s a 6-dimensional matrix of hell with NaNs and barfnots, when the
>>> name implies it’s simply doing a text to u64 conversion…
>>> 
>>> There is certainly a reason, but I’m really curious what it is :-)
>> 
>> It all goes back to commit 9f9b17a4f0 "Introduce strtosz() library
>> function to convert a string to a byte count.".  To support "convenient"
>> usage like "1.5G", it parses the number part with strtod().  This limits
>> us to 53 bits of precision.  Larger sizes get rounded.
>> 
>> I guess the excuse for this was that when you're dealing with sizes that
>> large (petabytes!), your least significant bits are zero anyway.
>> 
>> Regardless, the interface is *awful*.  We should've forced the author to
>> spell it out in all its glory in a proper function contract.  That tends
>> to cool the enthusiasm for "convenient" syntax amazingly fast.
>> 
>> The awful interface has been confusing people for close to a decade now.
>> 
>> What to do?
>
> I see. Thanks for the rationale. I knew it had to make sense :-)

For a value of "sense"...

> I’d probably avoid strtod even with the convenient syntax above.
> Do you want 1.33e-6M to be allowed? Do we want to ever
> accept or generate NaN or Inf values?

NaN or Inf definitely not.  That's why we use qemu_strtod_finite()
before and after the patch.

No sane person should ever use 1.33e-6M.  Or even 1.1k (which yields
1126, rounded silently from machine number 1126.4000000000001, which
approximates the true value 1126.4).

Certain fractions are actually sane.  1.5k denotes a perfectly fine
integer, which the code manages not to screw up.  I'd recommend against
using fractions regardless.

What usage are we prepared to break?  What kind of confusion are we
willing to bear?  Those are the questions.

>> Tao Xu's patch tries to make the function do what its users expect,
>> namely parse a bleepin' 64 bit integer, without breaking any of the
>> "convenience" syntax.  Turns out that's amazingly subtle.  Are we making
>> things less confusing or more?



  reply	other threads:[~2019-12-17 15:03 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-05  2:14 [PATCH] util/cutils: Expand do_strtosz parsing precision to 64 bits Tao Xu
2019-12-05 15:29 ` Markus Armbruster
2019-12-09  5:38   ` Tao Xu
2019-12-17 10:25     ` Markus Armbruster
2019-12-18  1:33       ` Tao Xu
2019-12-18  5:26         ` Tao Xu
2019-12-18 18:26           ` Markus Armbruster
2019-12-19  7:43             ` Tao Xu
2019-12-19 10:15               ` Markus Armbruster
2019-12-18 21:49         ` Eric Blake
2019-12-17 12:04   ` Christophe de Dinechin
2019-12-17 14:08     ` Markus Armbruster
2019-12-17 14:12       ` Christophe de Dinechin
2019-12-17 15:01         ` Markus Armbruster [this message]
2019-12-18  2:29           ` Tao Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o8w7ate6.fsf@dusky.pond.sub.org \
    --to=armbru@redhat.com \
    --cc=dinechin@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=mdroth@linux.vnet.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=tao3.xu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.