From: Tao Xu <tao3.xu@intel.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: "mdroth@linux.vnet.ibm.com" <mdroth@linux.vnet.ibm.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"ehabkost@redhat.com" <ehabkost@redhat.com>
Subject: Re: [PATCH] util/cutils: Expand do_strtosz parsing precision to 64 bits
Date: Wed, 18 Dec 2019 13:26:32 +0800
Message-ID: <52a11f3a-f2c8-26e9-b823-0093cfe91fdc@intel.com>
In-Reply-To: <e731445a-4461-3212-c08d-05dc7ad2b742@intel.com>
On 12/18/2019 9:33 AM, Tao Xu wrote:
> On 12/17/2019 6:25 PM, Markus Armbruster wrote:
>> Tao Xu <tao3.xu@intel.com> writes:
>>
>>> On 12/5/19 11:29 PM, Markus Armbruster wrote:
>>>> Tao Xu <tao3.xu@intel.com> writes:
>>>>
>>>>> Parse input string both as a double and as a uint64_t, then use the
>>>>> method which consumes more characters. Update the related test cases.
>>>>>
>>>>> Signed-off-by: Tao Xu <tao3.xu@intel.com>
>>>>> ---
>>>> [...]
>>>>> diff --git a/util/cutils.c b/util/cutils.c
>>>>> index 77acadc70a..b08058c57c 100644
>>>>> --- a/util/cutils.c
>>>>> +++ b/util/cutils.c
>>>>> @@ -212,24 +212,43 @@ static int do_strtosz(const char *nptr, const char **end,
>>>>> const char default_suffix, int64_t unit,
>>>>> uint64_t *result)
>>>>> {
>>>>> - int retval;
>>>>> - const char *endptr;
>>>>> + int retval, retd, retu;
>>>>> + const char *suffix, *suffixd, *suffixu;
>>>>> unsigned char c;
>>>>> int mul_required = 0;
>>>>> - double val, mul, integral, fraction;
>>>>> + bool use_strtod;
>>>>> + uint64_t valu;
>>>>> + double vald, mul, integral, fraction;
>>>>
>>>> Note for later: @mul is double.
>>>>
>>>>> +
>>>>> + retd = qemu_strtod_finite(nptr, &suffixd, &vald);
>>>>> + retu = qemu_strtou64(nptr, &suffixu, 0, &valu);
>>
>> Note for later: passing 0 to base accepts octal and hexadecimal
>> integers.
>>
>>>>> + use_strtod = strlen(suffixd) < strlen(suffixu);
>>>>> +
>>>>> + /*
>>>>> + * Parse @nptr both as a double and as a uint64_t, then use the method
>>>>> + * which consumes more characters.
>>>>> + */
>>>>
>>>> The comment is in a funny place. I'd put it right before the
>>>> qemu_strtod_finite() line.
>>>>
>>>>> + if (use_strtod) {
>>>>> + suffix = suffixd;
>>>>> + retval = retd;
>>>>> + } else {
>>>>> + suffix = suffixu;
>>>>> + retval = retu;
>>>>> + }
>>>>> - retval = qemu_strtod_finite(nptr, &endptr, &val);
>>>>> if (retval) {
>>>>> goto out;
>>>>> }
>>>>
>>>> This is even more subtle than it looks.
>>>>
>>>> A close reading of the function contracts leads to three cases for each
>>>> conversion:
>>>>
>>>> * parse error (including infinity and NaN)
>>>>
>>>> @retu / @retd is -EINVAL
>>>> @valu / @vald is uninitialized
>>>> @suffixu / @suffixd is @nptr
>>>>
>>>> * range error
>>>>
>>>> @retu / @retd is -ERANGE
>>>> @valu / @vald is our best approximation of the conversion result
>>>> @suffixu / @suffixd points to the first character not consumed by
>>>> the conversion.
>>>>
>>>> Sub-cases:
>>>>
>>>> - uint64_t overflow
>>>>
>>>> We know the conversion result exceeds UINT64_MAX.
>>>>
>>>> - double overflow
>>>>
>>>> we know the conversion result's magnitude exceeds the largest
>>>> representable finite double DBL_MAX.
>>>>
>>>> - double underflow
>>>>
>>>> we know the conversion result is close to zero (closer than DBL_MIN,
>>>> the smallest normalized positive double).
>>>>
>>>> * success
>>>>
>>>> @retu / @retd is 0
>>>> @valu / @vald is the conversion result
>>>> @suffixu / @suffixd points to the first character not consumed by
>>>> the conversion.
>>>>
>>>> This leads to a matrix (parse error, uint64_t overflow, success) x
>>>> (parse error, double overflow, double underflow, success). We need to
>>>> check the code does what we want for each element of this matrix, and
>>>> document any behavior that's not perfectly obvious.
>>>>
>>>> (success, success): we pick uint64_t if qemu_strtou64() consumed more
>>>> characters than qemu_strtod_finite(), else double. "More" is important
>>>> here; when they consume the same characters, we *need* to use the
>>>> uint64_t result. Example: for "18446744073709551615", we need to use
>>>> uint64_t 18446744073709551615, not double 18446744073709551616.0. But
>>>> for "18446744073709551616.", we need to use the double. Good.
>>
>> Also fun: for "0123", we use uint64_t 83, not double 123.0. But for
>> "0123.", we use 123.0, not 83.
>>
>> Do we really want to accept octal and hexadecimal integers?
>>
>
> Thank you for reminding me. Octal and hexadecimal may bring more
> confusion. I will use qemu_strtou64(nptr, &suffixu, 10, &valu) and add
> test for input like "0123".
>
Hi Markus,
After switching to qemu_strtou64(nptr, &suffixu, 10, &valu), I ran into
another problem. qemu_strtod_finite() supports hexadecimal input, while
qemu_strtou64() with base 10 stops after the leading "0" of "0x...".
qemu_strtod_finite() therefore consumes more characters, so a hexadecimal
string is parsed as a double, and large hexadecimal integers get rounded.
I see two possible solutions:
1: Use qemu_strtou64(nptr, &suffixu, 0, &valu) and parse octal as
decimal. This keeps hexadecimal valid, as it is now.
"0123" --> 123; "0x123" --> 291
2: Use qemu_strtou64(nptr, &suffixu, 10, &valu) and reject octal and
hexadecimal.
"0123" --> Error; "0x123" --> Error