* [PATCH 0/3] Improve do_strtosz precision @ 2021-02-04 19:07 Eric Blake 2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake ` (2 more replies) 0 siblings, 3 replies; 31+ messages in thread From: Eric Blake @ 2021-02-04 19:07 UTC (permalink / raw) To: qemu-devel; +Cc: vsementsov, berrange, qemu-block, tao3.xu, rjones, armbru Recently, commit 8b1170012b tweaked the maximum size the block layer will allow, which in turn affects nbdkit's testsuite of edge-case behaviors, where Rich noted [1] that our use of double meant rounding errors that cause spurious failures in qemu-io (among other places). So I decided to fix that. In the process, I was reminded that we have attempted this before, most recently in Dec 2019 [2], where Markus (rightly, IMHO) said that any approach that tries to reimplement strtoull() or which compares the amount of data consumed between strtod() and strtoull() adds more complexity than it solves. So first, our analysis of what we absolutely need: - Existing clients expect decimal scaling to work (1M is a lot easier to type than 1048576) - Existing clients expect hex to work (my initial attempt disabled it, and promptly hung in iotests when things like 0x10000 got parsed as zero rather than their intended byte value) (although our existing testsuite coverage was only via iotests, rather than unit tests) - Existing clients expect sane decimal fractions to work (1.5k is easier to type than 1536), although our testsuite coverage of this was less apparant - Existing clients are surprised when something that looks like a byte value is truncated or rounded due to an intermediate pass through double (hence Rich's bug report) [1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg00812.html [2] https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg00852.html My solution, instead of parsing twice and picking the longest, is to always parse the integral portion with strtoull(), then special case on what is left: if it is 'x' or 'X', the user wanted hex (and does NOT want a fraction, and in fact we can optionally warn about the use of scaling suffixes in this case); if it is '.', the user wanted the convenience of a floating point adjustment which we can do via strtod() on the suffix (and where we can optionally warn about fractions that are not evenly divisible into bytes). I also enhanced the testsuite (our coverage for hex constants was implicit via iotests, and now explicit in the unit test; and our testsuite did not flag the fact that we were previously accepting nonsense like '0x1.8k' for 1536, or '1.5e1M' for 53477376). We may decide that one or both of patch 2 and 3 goes too far, which is why I split them out from the main enhancement. Eric Blake (3): utils: Improve qemu_strtosz() to have 64 bits of precision utils: Deprecate hex-with-suffix sizes utils: Deprecate inexact fractional suffix sizes docs/system/deprecated.rst | 8 ++++ tests/test-cutils.c | 83 ++++++++++++++++++++++++++++--------- tests/test-keyval.c | 28 +++++++++---- tests/test-qemu-opts.c | 24 +++++++---- util/cutils.c | 85 +++++++++++++++++++++++++++++--------- tests/qemu-iotests/049.out | 7 +++- 6 files changed, 178 insertions(+), 57 deletions(-) -- 2.30.0 ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-04 19:07 [PATCH 0/3] Improve do_strtosz precision Eric Blake @ 2021-02-04 19:07 ` Eric Blake 2021-02-04 20:12 ` Eric Blake ` (4 more replies) 2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake 2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake 2 siblings, 5 replies; 31+ messages in thread From: Eric Blake @ 2021-02-04 19:07 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, vsementsov, berrange, qemu-block, tao3.xu, rjones, armbru, Max Reitz We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, the keyval visitor), and it gets annoying that edge-case testing is impacted by implicit rounding to 53 bits of precision due to parsing with strtod(). As an example posted by Rich Jones: $ nbdkit memory $(( 2**63 - 2**30 )) --run \ 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' write failed: Input/output error because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is out of bounds. It is also worth noting that our existing parser, by virtue of using strtod(), accepts decimal AND hex numbers, even though test-cutils previously lacked any coverage of the latter. We do have existing clients that expect a hex parse to work (for example, iotest 33 using qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 rather than as an invalid octal number, so we know there are no clients that depend on octal. Our use of strtod() also means that "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather than 1843 (if the fraction were 8/10); but as this was not covered in the testsuite, I have no qualms forbidding hex fractions as invalid, so this patch declares that the use of fractions is only supported with decimal input, and enhances the testsuite to document that. Our previous use of strtod() meant that -1 parsed as a negative; now that we parse with strtoull(), negative values can wrap around module 2^64, so we have to explicitly check whether the user passed in a '-'. We also had no testsuite coverage of "1.1e0k", which happened to parse under strtod() but is unlikely to occur in practice; as long as we are making things more robust, it is easy enough to reject the use of exponents in a strtod parse. The fix is done by breaking the parse into an integer prefix (no loss in precision), rejecting negative values (since we can no longer rely on strtod() to do that), determining if a decimal or hexadecimal parse was intended (with the new restriction that a fractional hex parse is not allowed), and where appropriate, using a floating point fractional parse (where we also scan to reject use of exponents in the fraction). The bulk of the patch is then updates to the testsuite to match our new precision, as well as adding new cases we reject (whether they were rejected or inadvertenly accepted before). Signed-off-by: Eric Blake <eblake@redhat.com> --- Note that this approach differs from what has been attempted in the past; see the thread starting at https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg00852.html That approach tried to parse both as strtoull and strtod and take whichever was longer, but that was harder to document. --- tests/test-cutils.c | 79 ++++++++++++++++++++++++++++++-------- tests/test-keyval.c | 24 +++++++++--- tests/test-qemu-opts.c | 20 +++++++--- util/cutils.c | 77 +++++++++++++++++++++++++++---------- tests/qemu-iotests/049.out | 7 +++- 5 files changed, 156 insertions(+), 51 deletions(-) diff --git a/tests/test-cutils.c b/tests/test-cutils.c index 1aa8351520ae..0c2c89d6f113 100644 --- a/tests/test-cutils.c +++ b/tests/test-cutils.c @@ -1960,6 +1960,13 @@ static void test_qemu_strtosz_simple(void) g_assert_cmpint(res, ==, 0); g_assert(endptr == str + 1); + /* Leading 0 gives decimal results, not octal */ + str = "08"; + err = qemu_strtosz(str, &endptr, &res); + g_assert_cmpint(err, ==, 0); + g_assert_cmpint(res, ==, 8); + g_assert(endptr == str + 2); + str = "12345"; err = qemu_strtosz(str, &endptr, &res); g_assert_cmpint(err, ==, 0); @@ -1970,7 +1977,7 @@ static void test_qemu_strtosz_simple(void) g_assert_cmpint(err, ==, 0); g_assert_cmpint(res, ==, 12345); - /* Note: precision is 53 bits since we're parsing with strtod() */ + /* Note: If there is no '.', we get full 64 bits of precision */ str = "9007199254740991"; /* 2^53-1 */ err = qemu_strtosz(str, &endptr, &res); @@ -1987,7 +1994,7 @@ static void test_qemu_strtosz_simple(void) str = "9007199254740993"; /* 2^53+1 */ err = qemu_strtosz(str, &endptr, &res); g_assert_cmpint(err, ==, 0); - g_assert_cmpint(res, ==, 0x20000000000000); /* rounded to 53 bits */ + g_assert_cmpint(res, ==, 0x20000000000001); g_assert(endptr == str + 16); str = "18446744073709549568"; /* 0xfffffffffffff800 (53 msbs set) */ @@ -1999,11 +2006,35 @@ static void test_qemu_strtosz_simple(void) str = "18446744073709550591"; /* 0xfffffffffffffbff */ err = qemu_strtosz(str, &endptr, &res); g_assert_cmpint(err, ==, 0); - g_assert_cmpint(res, ==, 0xfffffffffffff800); /* rounded to 53 bits */ + g_assert_cmpint(res, ==, 0xfffffffffffffbff); g_assert(endptr == str + 20); - /* 0x7ffffffffffffe00..0x7fffffffffffffff get rounded to - * 0x8000000000000000, thus -ERANGE; see test_qemu_strtosz_erange() */ + str = "18446744073709551615"; /* 0xffffffffffffffff */ + err = qemu_strtosz(str, &endptr, &res); + g_assert_cmpint(err, ==, 0); + g_assert_cmpint(res, ==, 0xffffffffffffffff); + g_assert(endptr == str + 20); + +} + +static void test_qemu_strtosz_hex(void) +{ + const char *str; + const char *endptr; + int err; + uint64_t res = 0xbaadf00d; + + str = "0x0"; + err = qemu_strtosz(str, &endptr, &res); + g_assert_cmpint(err, ==, 0); + g_assert_cmpint(res, ==, 0); + g_assert(endptr == str + 3); + + str = "0xa"; + err = qemu_strtosz(str, &endptr, &res); + g_assert_cmpint(err, ==, 0); + g_assert_cmpint(res, ==, 10); + g_assert(endptr == str + 3); } static void test_qemu_strtosz_units(void) @@ -2106,6 +2137,21 @@ static void test_qemu_strtosz_invalid(void) err = qemu_strtosz(str, &endptr, &res); g_assert_cmpint(err, ==, -EINVAL); g_assert(endptr == str); + + str = "1.1e5"; + err = qemu_strtosz(str, &endptr, &res); + g_assert_cmpint(err, ==, -EINVAL); + g_assert(endptr == str); + + str = "1.1B"; + err = qemu_strtosz(str, &endptr, &res); + g_assert_cmpint(err, ==, -EINVAL); + g_assert(endptr == str); + + str = "0x1.8k"; + err = qemu_strtosz(str, &endptr, &res); + g_assert_cmpint(err, ==, -EINVAL); + g_assert(endptr == str); } static void test_qemu_strtosz_trailing(void) @@ -2145,17 +2191,7 @@ static void test_qemu_strtosz_erange(void) g_assert_cmpint(err, ==, -ERANGE); g_assert(endptr == str + 2); - str = "18446744073709550592"; /* 0xfffffffffffffc00 */ - err = qemu_strtosz(str, &endptr, &res); - g_assert_cmpint(err, ==, -ERANGE); - g_assert(endptr == str + 20); - - str = "18446744073709551615"; /* 2^64-1 */ - err = qemu_strtosz(str, &endptr, &res); - g_assert_cmpint(err, ==, -ERANGE); - g_assert(endptr == str + 20); - - str = "18446744073709551616"; /* 2^64 */ + str = "18446744073709551616"; /* 2^64; see strtosz_simple for 2^64-1 */ err = qemu_strtosz(str, &endptr, &res); g_assert_cmpint(err, ==, -ERANGE); g_assert(endptr == str + 20); @@ -2168,15 +2204,22 @@ static void test_qemu_strtosz_erange(void) static void test_qemu_strtosz_metric(void) { - const char *str = "12345k"; + const char *str; int err; const char *endptr; uint64_t res = 0xbaadf00d; + str = "12345k"; err = qemu_strtosz_metric(str, &endptr, &res); g_assert_cmpint(err, ==, 0); g_assert_cmpint(res, ==, 12345000); g_assert(endptr == str + 6); + + str = "12.345M"; + err = qemu_strtosz_metric(str, &endptr, &res); + g_assert_cmpint(err, ==, 0); + g_assert_cmpint(res, ==, 12345000); + g_assert(endptr == str + 7); } int main(int argc, char **argv) @@ -2443,6 +2486,8 @@ int main(int argc, char **argv) g_test_add_func("/cutils/strtosz/simple", test_qemu_strtosz_simple); + g_test_add_func("/cutils/strtosz/hex", + test_qemu_strtosz_hex); g_test_add_func("/cutils/strtosz/units", test_qemu_strtosz_units); g_test_add_func("/cutils/strtosz/float", diff --git a/tests/test-keyval.c b/tests/test-keyval.c index ee927fe4e427..13be763650b2 100644 --- a/tests/test-keyval.c +++ b/tests/test-keyval.c @@ -445,9 +445,9 @@ static void test_keyval_visit_size(void) visit_end_struct(v, NULL); visit_free(v); - /* Note: precision is 53 bits since we're parsing with strtod() */ + /* Note: full 64 bits of precision */ - /* Around limit of precision: 2^53-1, 2^53, 2^53+1 */ + /* Around double limit of precision: 2^53-1, 2^53, 2^53+1 */ qdict = keyval_parse("sz1=9007199254740991," "sz2=9007199254740992," "sz3=9007199254740993", @@ -460,7 +460,7 @@ static void test_keyval_visit_size(void) visit_type_size(v, "sz2", &sz, &error_abort); g_assert_cmphex(sz, ==, 0x20000000000000); visit_type_size(v, "sz3", &sz, &error_abort); - g_assert_cmphex(sz, ==, 0x20000000000000); + g_assert_cmphex(sz, ==, 0x20000000000001); visit_check_struct(v, &error_abort); visit_end_struct(v, NULL); visit_free(v); @@ -475,7 +475,7 @@ static void test_keyval_visit_size(void) visit_type_size(v, "sz1", &sz, &error_abort); g_assert_cmphex(sz, ==, 0x7ffffffffffffc00); visit_type_size(v, "sz2", &sz, &error_abort); - g_assert_cmphex(sz, ==, 0x7ffffffffffffc00); + g_assert_cmphex(sz, ==, 0x7ffffffffffffdff); visit_check_struct(v, &error_abort); visit_end_struct(v, NULL); visit_free(v); @@ -490,14 +490,26 @@ static void test_keyval_visit_size(void) visit_type_size(v, "sz1", &sz, &error_abort); g_assert_cmphex(sz, ==, 0xfffffffffffff800); visit_type_size(v, "sz2", &sz, &error_abort); - g_assert_cmphex(sz, ==, 0xfffffffffffff800); + g_assert_cmphex(sz, ==, 0xfffffffffffffbff); + visit_check_struct(v, &error_abort); + visit_end_struct(v, NULL); + visit_free(v); + + /* Actual limit */ + qdict = keyval_parse("sz1=18446744073709551615", /* ffffffffffffffff */ + NULL, NULL, &error_abort); + v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); + qobject_unref(qdict); + visit_start_struct(v, NULL, NULL, 0, &error_abort); + visit_type_size(v, "sz1", &sz, &error_abort); + g_assert_cmphex(sz, ==, 0xffffffffffffffff); visit_check_struct(v, &error_abort); visit_end_struct(v, NULL); visit_free(v); /* Beyond limits */ qdict = keyval_parse("sz1=-1," - "sz2=18446744073709550592", /* fffffffffffffc00 */ + "sz2=18446744073709551616", /* 2^64 */ NULL, NULL, &error_abort); v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); qobject_unref(qdict); diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c index 8bbb17b1c726..f79b698e6715 100644 --- a/tests/test-qemu-opts.c +++ b/tests/test-qemu-opts.c @@ -654,9 +654,9 @@ static void test_opts_parse_size(void) g_assert_cmpuint(opts_count(opts), ==, 1); g_assert_cmpuint(qemu_opt_get_size(opts, "size1", 1), ==, 0); - /* Note: precision is 53 bits since we're parsing with strtod() */ + /* Note: full 64 bits of precision */ - /* Around limit of precision: 2^53-1, 2^53, 2^54 */ + /* Around double limit of precision: 2^53-1, 2^53, 2^53+1 */ opts = qemu_opts_parse(&opts_list_02, "size1=9007199254740991," "size2=9007199254740992," @@ -668,7 +668,7 @@ static void test_opts_parse_size(void) g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1), ==, 0x20000000000000); g_assert_cmphex(qemu_opt_get_size(opts, "size3", 1), - ==, 0x20000000000000); + ==, 0x20000000000001); /* Close to signed upper limit 0x7ffffffffffffc00 (53 msbs set) */ opts = qemu_opts_parse(&opts_list_02, @@ -679,7 +679,7 @@ static void test_opts_parse_size(void) g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1), ==, 0x7ffffffffffffc00); g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1), - ==, 0x7ffffffffffffc00); + ==, 0x7ffffffffffffdff); /* Close to actual upper limit 0xfffffffffffff800 (53 msbs set) */ opts = qemu_opts_parse(&opts_list_02, @@ -690,14 +690,22 @@ static void test_opts_parse_size(void) g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1), ==, 0xfffffffffffff800); g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1), - ==, 0xfffffffffffff800); + ==, 0xfffffffffffffbff); + + /* Actual limit */ + opts = qemu_opts_parse(&opts_list_02, + "size1=18446744073709551615", /* ffffffffffffffff */ + false, &error_abort); + g_assert_cmpuint(opts_count(opts), ==, 1); + g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1), + ==, 0xffffffffffffffff); /* Beyond limits */ opts = qemu_opts_parse(&opts_list_02, "size1=-1", false, &err); error_free_or_abort(&err); g_assert(!opts); opts = qemu_opts_parse(&opts_list_02, - "size1=18446744073709550592", /* fffffffffffffc00 */ + "size1=18446744073709551616", /* 2^64 */ false, &err); error_free_or_abort(&err); g_assert(!opts); diff --git a/util/cutils.c b/util/cutils.c index 0b5073b33012..0234763bd70b 100644 --- a/util/cutils.c +++ b/util/cutils.c @@ -241,10 +241,21 @@ static int64_t suffix_mul(char suffix, int64_t unit) } /* - * Convert string to bytes, allowing either B/b for bytes, K/k for KB, - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on - * other error. + * Convert size string to bytes. + * + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or + * T/t for TB, with scaling based on @unit, and with @default_suffix + * implied if no explicit suffix was given. + * + * The end pointer will be returned in *end, if not NULL. If there is + * no fraction, the input can be decimal or hexadecimal; if there is a + * fraction, then the input must be decimal and there must be a suffix + * (possibly by @default_suffix) larger than Byte, and the fractional + * portion may suffer from precision loss or rounding. The input must + * be positive. + * + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on + * other error (with *@end left unchanged). */ static int do_strtosz(const char *nptr, const char **end, const char default_suffix, int64_t unit, @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end, int retval; const char *endptr; unsigned char c; - int mul_required = 0; - double val, mul, integral, fraction; + bool mul_required = false; + uint64_t val; + int64_t mul; + double fraction = 0.0; - retval = qemu_strtod_finite(nptr, &endptr, &val); + /* Parse integral portion as decimal. */ + retval = qemu_strtou64(nptr, &endptr, 10, &val); if (retval) { goto out; } - fraction = modf(val, &integral); - if (fraction != 0) { - mul_required = 1; + if (strchr(nptr, '-') != NULL) { + retval = -ERANGE; + goto out; + } + if (val == 0 && (*endptr == 'x' || *endptr == 'X')) { + /* Input looks like hex, reparse, and insist on no fraction. */ + retval = qemu_strtou64(nptr, &endptr, 16, &val); + if (retval) { + goto out; + } + if (*endptr == '.') { + endptr = nptr; + retval = -EINVAL; + goto out; + } + } else if (*endptr == '.') { + /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */ + retval = qemu_strtod_finite(endptr, &endptr, &fraction); + if (retval) { + endptr = nptr; + goto out; + } + if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr) + || memchr(nptr, 'E', endptr - nptr)) { + endptr = nptr; + retval = -EINVAL; + goto out; + } + if (fraction != 0) { + mul_required = true; + } } c = *endptr; mul = suffix_mul(c, unit); - if (mul >= 0) { + if (mul > 0) { endptr++; } else { mul = suffix_mul(default_suffix, unit); - assert(mul >= 0); + assert(mul > 0); } if (mul == 1 && mul_required) { + endptr = nptr; retval = -EINVAL; goto out; } - /* - * Values near UINT64_MAX overflow to 2**64 when converting to double - * precision. Compare against the maximum representable double precision - * value below 2**64, computed as "the next value after 2**64 (0x1p64) in - * the direction of 0". - */ - if ((val * mul > nextafter(0x1p64, 0)) || val < 0) { + if (val > UINT64_MAX / mul) { retval = -ERANGE; goto out; } - *result = val * mul; + *result = val * mul + (uint64_t) (fraction * mul); retval = 0; out: diff --git a/tests/qemu-iotests/049.out b/tests/qemu-iotests/049.out index b1d8fd9107ef..f38d3ccd5978 100644 --- a/tests/qemu-iotests/049.out +++ b/tests/qemu-iotests/049.out @@ -98,10 +98,13 @@ qemu-img create -f qcow2 -o size=-1024 TEST_DIR/t.qcow2 qemu-img: TEST_DIR/t.qcow2: Value '-1024' is out of range for parameter 'size' qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- -1k -qemu-img: Invalid image size specified. Must be between 0 and 9223372036854775807. +qemu-img: Invalid image size specified. You may use k, M, G, T, P or E suffixes for +qemu-img: kilobytes, megabytes, gigabytes, terabytes, petabytes and exabytes. qemu-img create -f qcow2 -o size=-1k TEST_DIR/t.qcow2 -qemu-img: TEST_DIR/t.qcow2: Value '-1k' is out of range for parameter 'size' +qemu-img: TEST_DIR/t.qcow2: Parameter 'size' expects a non-negative number below 2^64 +Optional suffix k, M, G, T, P or E means kilo-, mega-, giga-, tera-, peta- +and exabytes, respectively. qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- 1kilobyte qemu-img: Invalid image size specified. You may use k, M, G, T, P or E suffixes for -- 2.30.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake @ 2021-02-04 20:12 ` Eric Blake 2021-02-05 10:06 ` Vladimir Sementsov-Ogievskiy 2021-02-05 10:07 ` Vladimir Sementsov-Ogievskiy ` (3 subsequent siblings) 4 siblings, 1 reply; 31+ messages in thread From: Eric Blake @ 2021-02-04 20:12 UTC (permalink / raw) To: qemu-devel Cc: Kevin Wolf, vsementsov, berrange, qemu-block, tao3.xu, armbru, Max Reitz On 2/4/21 1:07 PM, Eric Blake wrote: > We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, > the keyval visitor), and it gets annoying that edge-case testing is > impacted by implicit rounding to 53 bits of precision due to parsing > with strtod(). As an example posted by Rich Jones: > $ nbdkit memory $(( 2**63 - 2**30 )) --run \ > 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' > write failed: Input/output error > > because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is > out of bounds. > > It is also worth noting that our existing parser, by virtue of using > strtod(), accepts decimal AND hex numbers, even though test-cutils > previously lacked any coverage of the latter. We do have existing > clients that expect a hex parse to work (for example, iotest 33 using > qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 > rather than as an invalid octal number, so we know there are no > clients that depend on octal. Our use of strtod() also means that > "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather > than 1843 (if the fraction were 8/10); but as this was not covered in > the testsuite, I have no qualms forbidding hex fractions as invalid, > so this patch declares that the use of fractions is only supported > with decimal input, and enhances the testsuite to document that. > > Our previous use of strtod() meant that -1 parsed as a negative; now > that we parse with strtoull(), negative values can wrap around module modulo > 2^64, so we have to explicitly check whether the user passed in a '-'. > > We also had no testsuite coverage of "1.1e0k", which happened to parse > under strtod() but is unlikely to occur in practice; as long as we are > making things more robust, it is easy enough to reject the use of > exponents in a strtod parse. > > The fix is done by breaking the parse into an integer prefix (no loss > in precision), rejecting negative values (since we can no longer rely > on strtod() to do that), determining if a decimal or hexadecimal parse > was intended (with the new restriction that a fractional hex parse is > not allowed), and where appropriate, using a floating point fractional > parse (where we also scan to reject use of exponents in the fraction). > The bulk of the patch is then updates to the testsuite to match our > new precision, as well as adding new cases we reject (whether they > were rejected or inadvertenly accepted before). inadvertently > > Signed-off-by: Eric Blake <eblake@redhat.com> > > --- > Is it a bad sign when I review my own code? > > /* > - * Convert string to bytes, allowing either B/b for bytes, K/k for KB, > - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned > - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on > - * other error. > + * Convert size string to bytes. > + * > + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or > + * T/t for TB, with scaling based on @unit, and with @default_suffix > + * implied if no explicit suffix was given. Reformatted existing text, but incomplete; we also support 'P' and 'E'. > + * > + * The end pointer will be returned in *end, if not NULL. If there is > + * no fraction, the input can be decimal or hexadecimal; if there is a > + * fraction, then the input must be decimal and there must be a suffix > + * (possibly by @default_suffix) larger than Byte, and the fractional > + * portion may suffer from precision loss or rounding. The input must > + * be positive. > + * > + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on > + * other error (with *@end left unchanged). If we take patch 2 and 3, this contract should also be updated to mention what we consider to be deprecated. > */ > static int do_strtosz(const char *nptr, const char **end, > const char default_suffix, int64_t unit, > @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end, > int retval; > const char *endptr; > unsigned char c; > - int mul_required = 0; > - double val, mul, integral, fraction; > + bool mul_required = false; > + uint64_t val; > + int64_t mul; > + double fraction = 0.0; > > - retval = qemu_strtod_finite(nptr, &endptr, &val); > + /* Parse integral portion as decimal. */ > + retval = qemu_strtou64(nptr, &endptr, 10, &val); > if (retval) { > goto out; > } > - fraction = modf(val, &integral); > - if (fraction != 0) { > - mul_required = 1; > + if (strchr(nptr, '-') != NULL) { > + retval = -ERANGE; > + goto out; > + } > + if (val == 0 && (*endptr == 'x' || *endptr == 'X')) { > + /* Input looks like hex, reparse, and insist on no fraction. */ > + retval = qemu_strtou64(nptr, &endptr, 16, &val); > + if (retval) { > + goto out; > + } > + if (*endptr == '.') { > + endptr = nptr; > + retval = -EINVAL; > + goto out; > + } > + } else if (*endptr == '.') { > + /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */ > + retval = qemu_strtod_finite(endptr, &endptr, &fraction); > + if (retval) { > + endptr = nptr; > + goto out; > + } > + if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr) > + || memchr(nptr, 'E', endptr - nptr)) { > + endptr = nptr; > + retval = -EINVAL; > + goto out; > + } > + if (fraction != 0) { > + mul_required = true; > + } > } > c = *endptr; > mul = suffix_mul(c, unit); > - if (mul >= 0) { > + if (mul > 0) { > endptr++; > } else { > mul = suffix_mul(default_suffix, unit); > - assert(mul >= 0); > + assert(mul > 0); > } > if (mul == 1 && mul_required) { > + endptr = nptr; > retval = -EINVAL; > goto out; > } > - /* > - * Values near UINT64_MAX overflow to 2**64 when converting to double > - * precision. Compare against the maximum representable double precision > - * value below 2**64, computed as "the next value after 2**64 (0x1p64) in > - * the direction of 0". > - */ > - if ((val * mul > nextafter(0x1p64, 0)) || val < 0) { > + if (val > UINT64_MAX / mul) { Hmm, do we care about: 15.9999999999999999999999999999E where the fractional portion becomes large enough to actually bump our sum below to 16E which indeed overflows? Then again, we rejected a fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0 due to rounding. Maybe it's just worth a good comment why the overflow check here works without consulting fraction. > retval = -ERANGE; > goto out; > } > - *result = val * mul; > + *result = val * mul + (uint64_t) (fraction * mul); > retval = 0; > > out: -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-04 20:12 ` Eric Blake @ 2021-02-05 10:06 ` Vladimir Sementsov-Ogievskiy 2021-02-05 10:18 ` Vladimir Sementsov-Ogievskiy 2021-02-05 14:06 ` Eric Blake 0 siblings, 2 replies; 31+ messages in thread From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:06 UTC (permalink / raw) To: Eric Blake, qemu-devel Cc: Kevin Wolf, berrange, qemu-block, tao3.xu, armbru, Max Reitz 04.02.2021 23:12, Eric Blake wrote: > On 2/4/21 1:07 PM, Eric Blake wrote: >> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, >> the keyval visitor), and it gets annoying that edge-case testing is >> impacted by implicit rounding to 53 bits of precision due to parsing >> with strtod(). As an example posted by Rich Jones: >> $ nbdkit memory $(( 2**63 - 2**30 )) --run \ >> 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' >> write failed: Input/output error >> >> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is >> out of bounds. >> >> It is also worth noting that our existing parser, by virtue of using >> strtod(), accepts decimal AND hex numbers, even though test-cutils >> previously lacked any coverage of the latter. We do have existing >> clients that expect a hex parse to work (for example, iotest 33 using >> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 >> rather than as an invalid octal number, so we know there are no >> clients that depend on octal. Our use of strtod() also means that >> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather >> than 1843 (if the fraction were 8/10); but as this was not covered in >> the testsuite, I have no qualms forbidding hex fractions as invalid, >> so this patch declares that the use of fractions is only supported >> with decimal input, and enhances the testsuite to document that. >> >> Our previous use of strtod() meant that -1 parsed as a negative; now >> that we parse with strtoull(), negative values can wrap around module > > modulo > >> 2^64, so we have to explicitly check whether the user passed in a '-'. >> >> We also had no testsuite coverage of "1.1e0k", which happened to parse >> under strtod() but is unlikely to occur in practice; as long as we are >> making things more robust, it is easy enough to reject the use of >> exponents in a strtod parse. >> >> The fix is done by breaking the parse into an integer prefix (no loss >> in precision), rejecting negative values (since we can no longer rely >> on strtod() to do that), determining if a decimal or hexadecimal parse >> was intended (with the new restriction that a fractional hex parse is >> not allowed), and where appropriate, using a floating point fractional >> parse (where we also scan to reject use of exponents in the fraction). >> The bulk of the patch is then updates to the testsuite to match our >> new precision, as well as adding new cases we reject (whether they >> were rejected or inadvertenly accepted before). > > inadvertently > >> >> Signed-off-by: Eric Blake <eblake@redhat.com> >> >> --- >> > > Is it a bad sign when I review my own code? > >> >> /* >> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB, >> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned >> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on >> - * other error. >> + * Convert size string to bytes. >> + * >> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or >> + * T/t for TB, with scaling based on @unit, and with @default_suffix >> + * implied if no explicit suffix was given. > > Reformatted existing text, but incomplete; we also support 'P' and 'E'. > >> + * >> + * The end pointer will be returned in *end, if not NULL. If there is >> + * no fraction, the input can be decimal or hexadecimal; if there is a >> + * fraction, then the input must be decimal and there must be a suffix >> + * (possibly by @default_suffix) larger than Byte, and the fractional >> + * portion may suffer from precision loss or rounding. The input must >> + * be positive. >> + * >> + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on >> + * other error (with *@end left unchanged). > > If we take patch 2 and 3, this contract should also be updated to > mention what we consider to be deprecated. > >> */ >> static int do_strtosz(const char *nptr, const char **end, >> const char default_suffix, int64_t unit, >> @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end, >> int retval; >> const char *endptr; >> unsigned char c; >> - int mul_required = 0; >> - double val, mul, integral, fraction; >> + bool mul_required = false; >> + uint64_t val; >> + int64_t mul; >> + double fraction = 0.0; >> >> - retval = qemu_strtod_finite(nptr, &endptr, &val); >> + /* Parse integral portion as decimal. */ >> + retval = qemu_strtou64(nptr, &endptr, 10, &val); >> if (retval) { >> goto out; >> } >> - fraction = modf(val, &integral); >> - if (fraction != 0) { >> - mul_required = 1; >> + if (strchr(nptr, '-') != NULL) { >> + retval = -ERANGE; >> + goto out; >> + } >> + if (val == 0 && (*endptr == 'x' || *endptr == 'X')) { >> + /* Input looks like hex, reparse, and insist on no fraction. */ >> + retval = qemu_strtou64(nptr, &endptr, 16, &val); >> + if (retval) { >> + goto out; >> + } >> + if (*endptr == '.') { >> + endptr = nptr; >> + retval = -EINVAL; >> + goto out; >> + } >> + } else if (*endptr == '.') { >> + /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */ >> + retval = qemu_strtod_finite(endptr, &endptr, &fraction); >> + if (retval) { >> + endptr = nptr; >> + goto out; >> + } >> + if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr) >> + || memchr(nptr, 'E', endptr - nptr)) { >> + endptr = nptr; >> + retval = -EINVAL; >> + goto out; >> + } >> + if (fraction != 0) { >> + mul_required = true; >> + } >> } >> c = *endptr; >> mul = suffix_mul(c, unit); >> - if (mul >= 0) { >> + if (mul > 0) { >> endptr++; >> } else { >> mul = suffix_mul(default_suffix, unit); >> - assert(mul >= 0); >> + assert(mul > 0); >> } >> if (mul == 1 && mul_required) { >> + endptr = nptr; >> retval = -EINVAL; >> goto out; >> } >> - /* >> - * Values near UINT64_MAX overflow to 2**64 when converting to double >> - * precision. Compare against the maximum representable double precision >> - * value below 2**64, computed as "the next value after 2**64 (0x1p64) in >> - * the direction of 0". >> - */ >> - if ((val * mul > nextafter(0x1p64, 0)) || val < 0) { >> + if (val > UINT64_MAX / mul) { > > Hmm, do we care about: > 15.9999999999999999999999999999E > where the fractional portion becomes large enough to actually bump our > sum below to 16E which indeed overflows? Then again, we rejected a > fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0 > due to rounding. > Maybe it's just worth a good comment why the overflow check here works > without consulting fraction. worth a good comment, because I don't follow :) If mul is big enough and fraction=0.5, why val*mul + fraction*mul will not overflow? Also, if we find '.' in the number, why not just reparse the whole number with qemu_strtod_finite? And don't care about 1.0? > >> retval = -ERANGE; >> goto out; >> } >> - *result = val * mul; >> + *result = val * mul + (uint64_t) (fraction * mul); >> retval = 0; >> >> out: -- Best regards, Vladimir ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-05 10:06 ` Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:18 ` Vladimir Sementsov-Ogievskiy 2021-02-05 14:06 ` Eric Blake 1 sibling, 0 replies; 31+ messages in thread From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:18 UTC (permalink / raw) To: Eric Blake, qemu-devel Cc: Kevin Wolf, berrange, qemu-block, tao3.xu, armbru, Max Reitz 05.02.2021 13:06, Vladimir Sementsov-Ogievskiy wrote: > 04.02.2021 23:12, Eric Blake wrote: >> On 2/4/21 1:07 PM, Eric Blake wrote: >>> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, >>> the keyval visitor), and it gets annoying that edge-case testing is >>> impacted by implicit rounding to 53 bits of precision due to parsing >>> with strtod(). As an example posted by Rich Jones: >>> $ nbdkit memory $(( 2**63 - 2**30 )) --run \ >>> 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' >>> write failed: Input/output error >>> >>> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is >>> out of bounds. >>> >>> It is also worth noting that our existing parser, by virtue of using >>> strtod(), accepts decimal AND hex numbers, even though test-cutils >>> previously lacked any coverage of the latter. We do have existing >>> clients that expect a hex parse to work (for example, iotest 33 using >>> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 >>> rather than as an invalid octal number, so we know there are no >>> clients that depend on octal. Our use of strtod() also means that >>> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather >>> than 1843 (if the fraction were 8/10); but as this was not covered in >>> the testsuite, I have no qualms forbidding hex fractions as invalid, >>> so this patch declares that the use of fractions is only supported >>> with decimal input, and enhances the testsuite to document that. >>> >>> Our previous use of strtod() meant that -1 parsed as a negative; now >>> that we parse with strtoull(), negative values can wrap around module >> >> modulo >> >>> 2^64, so we have to explicitly check whether the user passed in a '-'. >>> >>> We also had no testsuite coverage of "1.1e0k", which happened to parse >>> under strtod() but is unlikely to occur in practice; as long as we are >>> making things more robust, it is easy enough to reject the use of >>> exponents in a strtod parse. >>> >>> The fix is done by breaking the parse into an integer prefix (no loss >>> in precision), rejecting negative values (since we can no longer rely >>> on strtod() to do that), determining if a decimal or hexadecimal parse >>> was intended (with the new restriction that a fractional hex parse is >>> not allowed), and where appropriate, using a floating point fractional >>> parse (where we also scan to reject use of exponents in the fraction). >>> The bulk of the patch is then updates to the testsuite to match our >>> new precision, as well as adding new cases we reject (whether they >>> were rejected or inadvertenly accepted before). >> >> inadvertently >> >>> >>> Signed-off-by: Eric Blake <eblake@redhat.com> >>> >>> --- >>> >> >> Is it a bad sign when I review my own code? >> >>> >>> /* >>> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB, >>> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned >>> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on >>> - * other error. >>> + * Convert size string to bytes. >>> + * >>> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or >>> + * T/t for TB, with scaling based on @unit, and with @default_suffix >>> + * implied if no explicit suffix was given. >> >> Reformatted existing text, but incomplete; we also support 'P' and 'E'. >> >>> + * >>> + * The end pointer will be returned in *end, if not NULL. If there is >>> + * no fraction, the input can be decimal or hexadecimal; if there is a >>> + * fraction, then the input must be decimal and there must be a suffix >>> + * (possibly by @default_suffix) larger than Byte, and the fractional >>> + * portion may suffer from precision loss or rounding. The input must >>> + * be positive. >>> + * >>> + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on >>> + * other error (with *@end left unchanged). >> >> If we take patch 2 and 3, this contract should also be updated to >> mention what we consider to be deprecated. >> >>> */ >>> static int do_strtosz(const char *nptr, const char **end, >>> const char default_suffix, int64_t unit, >>> @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end, >>> int retval; >>> const char *endptr; >>> unsigned char c; >>> - int mul_required = 0; >>> - double val, mul, integral, fraction; >>> + bool mul_required = false; >>> + uint64_t val; >>> + int64_t mul; >>> + double fraction = 0.0; >>> >>> - retval = qemu_strtod_finite(nptr, &endptr, &val); >>> + /* Parse integral portion as decimal. */ >>> + retval = qemu_strtou64(nptr, &endptr, 10, &val); >>> if (retval) { >>> goto out; >>> } >>> - fraction = modf(val, &integral); >>> - if (fraction != 0) { >>> - mul_required = 1; >>> + if (strchr(nptr, '-') != NULL) { >>> + retval = -ERANGE; >>> + goto out; >>> + } >>> + if (val == 0 && (*endptr == 'x' || *endptr == 'X')) { >>> + /* Input looks like hex, reparse, and insist on no fraction. */ >>> + retval = qemu_strtou64(nptr, &endptr, 16, &val); >>> + if (retval) { >>> + goto out; >>> + } >>> + if (*endptr == '.') { >>> + endptr = nptr; >>> + retval = -EINVAL; >>> + goto out; >>> + } >>> + } else if (*endptr == '.') { >>> + /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */ >>> + retval = qemu_strtod_finite(endptr, &endptr, &fraction); >>> + if (retval) { >>> + endptr = nptr; >>> + goto out; >>> + } >>> + if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr) >>> + || memchr(nptr, 'E', endptr - nptr)) { >>> + endptr = nptr; >>> + retval = -EINVAL; >>> + goto out; >>> + } >>> + if (fraction != 0) { >>> + mul_required = true; >>> + } >>> } >>> c = *endptr; >>> mul = suffix_mul(c, unit); >>> - if (mul >= 0) { >>> + if (mul > 0) { >>> endptr++; >>> } else { >>> mul = suffix_mul(default_suffix, unit); >>> - assert(mul >= 0); >>> + assert(mul > 0); >>> } >>> if (mul == 1 && mul_required) { >>> + endptr = nptr; >>> retval = -EINVAL; >>> goto out; >>> } >>> - /* >>> - * Values near UINT64_MAX overflow to 2**64 when converting to double >>> - * precision. Compare against the maximum representable double precision >>> - * value below 2**64, computed as "the next value after 2**64 (0x1p64) in >>> - * the direction of 0". >>> - */ >>> - if ((val * mul > nextafter(0x1p64, 0)) || val < 0) { >>> + if (val > UINT64_MAX / mul) { >> >> Hmm, do we care about: >> 15.9999999999999999999999999999E >> where the fractional portion becomes large enough to actually bump our >> sum below to 16E which indeed overflows? Then again, we rejected a >> fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0 >> due to rounding. >> Maybe it's just worth a good comment why the overflow check here works >> without consulting fraction. > > worth a good comment, because I don't follow :) > > If mul is big enough and fraction=0.5, why val*mul + fraction*mul will not overflow? > > Also, if we find '.' in the number, why not just reparse the whole number with qemu_strtod_finite? And don't care about 1.0? Ah understand now, because mul is a power of two, so 2^64 - (val * mul) >= mul. Still, is it possible that (due to float calculations limited precision) some double close to 1 (but still < 1.0) being multiplied to mul will result in something >= mul? Also I failed to understand what you write because I forget about E=exbibyte and thought only about exponent > >> >>> retval = -ERANGE; >>> goto out; >>> } >>> - *result = val * mul; >>> + *result = val * mul + (uint64_t) (fraction * mul); >>> retval = 0; >>> >>> out: > > -- Best regards, Vladimir ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-05 10:06 ` Vladimir Sementsov-Ogievskiy 2021-02-05 10:18 ` Vladimir Sementsov-Ogievskiy @ 2021-02-05 14:06 ` Eric Blake 2021-02-05 14:10 ` Daniel P. Berrangé 1 sibling, 1 reply; 31+ messages in thread From: Eric Blake @ 2021-02-05 14:06 UTC (permalink / raw) To: Vladimir Sementsov-Ogievskiy, qemu-devel Cc: Kevin Wolf, berrange, qemu-block, tao3.xu, armbru, Max Reitz On 2/5/21 4:06 AM, Vladimir Sementsov-Ogievskiy wrote: >>> - /* >>> - * Values near UINT64_MAX overflow to 2**64 when converting to >>> double >>> - * precision. Compare against the maximum representable double >>> precision >>> - * value below 2**64, computed as "the next value after 2**64 >>> (0x1p64) in >>> - * the direction of 0". >>> - */ >>> - if ((val * mul > nextafter(0x1p64, 0)) || val < 0) { >>> + if (val > UINT64_MAX / mul) { >> >> Hmm, do we care about: >> 15.9999999999999999999999999999E >> where the fractional portion becomes large enough to actually bump our >> sum below to 16E which indeed overflows? Then again, we rejected a >> fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0 >> due to rounding. >> Maybe it's just worth a good comment why the overflow check here works >> without consulting fraction. > > worth a good comment, because I don't follow :) > > If mul is big enough and fraction=0.5, why val*mul + fraction*mul will > not overflow? When mul is a power of 2, we know that fraction*mul does not change the number of significant bits, but merely moves the exponent, so starting with fraction < 1.0, we know fraction*mul < mul. But when @unit is 1000, there is indeed a rare possibility that the multiplication will cause an inexact answer that will trigger rounding, so we could end up with fraction*mul == mul. So I'm not yet 100% confident that there is no possible combination where we can't cause an overflow to result in val*mul + (uint64_t)(fraction*mul) resulting in 0 instead of UINT64_MAX, and I think I will have to tighten this code up for v2. > > Also, if we find '.' in the number, why not just reparse the whole > number with qemu_strtod_finite? And don't care about 1.0? Reparsing the whole number loses precision. Since we already have a 64-bit precise integer, why throw it away? > >> >>> retval = -ERANGE; >>> goto out; >>> } >>> - *result = val * mul; >>> + *result = val * mul + (uint64_t) (fraction * mul); >>> retval = 0; >>> >>> out: > > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-05 14:06 ` Eric Blake @ 2021-02-05 14:10 ` Daniel P. Berrangé 0 siblings, 0 replies; 31+ messages in thread From: Daniel P. Berrangé @ 2021-02-05 14:10 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, tao3.xu, qemu-devel, armbru, Max Reitz On Fri, Feb 05, 2021 at 08:06:53AM -0600, Eric Blake wrote: > On 2/5/21 4:06 AM, Vladimir Sementsov-Ogievskiy wrote: > > >>> - /* > >>> - * Values near UINT64_MAX overflow to 2**64 when converting to > >>> double > >>> - * precision. Compare against the maximum representable double > >>> precision > >>> - * value below 2**64, computed as "the next value after 2**64 > >>> (0x1p64) in > >>> - * the direction of 0". > >>> - */ > >>> - if ((val * mul > nextafter(0x1p64, 0)) || val < 0) { > >>> + if (val > UINT64_MAX / mul) { > >> > >> Hmm, do we care about: > >> 15.9999999999999999999999999999E > >> where the fractional portion becomes large enough to actually bump our > >> sum below to 16E which indeed overflows? Then again, we rejected a > >> fraction of 1.0 above, and 0.9999999999999999999999999999 parses to 1.0 > >> due to rounding. > >> Maybe it's just worth a good comment why the overflow check here works > >> without consulting fraction. > > > > worth a good comment, because I don't follow :) > > > > If mul is big enough and fraction=0.5, why val*mul + fraction*mul will > > not overflow? > > When mul is a power of 2, we know that fraction*mul does not change the > number of significant bits, but merely moves the exponent, so starting > with fraction < 1.0, we know fraction*mul < mul. But when @unit is > 1000, there is indeed a rare possibility that the multiplication will > cause an inexact answer that will trigger rounding, so we could end up > with fraction*mul == mul. So I'm not yet 100% confident that there is > no possible combination where we can't cause an overflow to result in > val*mul + (uint64_t)(fraction*mul) resulting in 0 instead of UINT64_MAX, > and I think I will have to tighten this code up for v2. > > > > > > Also, if we find '.' in the number, why not just reparse the whole > > number with qemu_strtod_finite? And don't care about 1.0? > > Reparsing the whole number loses precision. Since we already have a > 64-bit precise integer, why throw it away? Yep, it isn't acceptable to throw away precision of the non-fractional part of the input IMHO. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake 2021-02-04 20:12 ` Eric Blake @ 2021-02-05 10:07 ` Vladimir Sementsov-Ogievskiy 2021-02-05 14:12 ` Eric Blake 2021-02-05 10:28 ` Richard W.M. Jones ` (2 subsequent siblings) 4 siblings, 1 reply; 31+ messages in thread From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:07 UTC (permalink / raw) To: Eric Blake, qemu-devel Cc: rjones, berrange, armbru, qemu-block, tao3.xu, Kevin Wolf, Max Reitz 04.02.2021 22:07, Eric Blake wrote: > We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, > the keyval visitor), and it gets annoying that edge-case testing is > impacted by implicit rounding to 53 bits of precision due to parsing > with strtod(). As an example posted by Rich Jones: > $ nbdkit memory $(( 2**63 - 2**30 )) --run \ > 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' > write failed: Input/output error > > because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is > out of bounds. > > It is also worth noting that our existing parser, by virtue of using > strtod(), accepts decimal AND hex numbers, even though test-cutils > previously lacked any coverage of the latter. We do have existing > clients that expect a hex parse to work (for example, iotest 33 using > qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 > rather than as an invalid octal number, so we know there are no > clients that depend on octal. Our use of strtod() also means that > "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather > than 1843 (if the fraction were 8/10); but as this was not covered in > the testsuite, I have no qualms forbidding hex fractions as invalid, > so this patch declares that the use of fractions is only supported > with decimal input, and enhances the testsuite to document that. > > Our previous use of strtod() meant that -1 parsed as a negative; now > that we parse with strtoull(), negative values can wrap around module > 2^64, so we have to explicitly check whether the user passed in a '-'. > > We also had no testsuite coverage of "1.1e0k", which happened to parse > under strtod() but is unlikely to occur in practice; as long as we are > making things more robust, it is easy enough to reject the use of > exponents in a strtod parse. > > The fix is done by breaking the parse into an integer prefix (no loss > in precision), rejecting negative values (since we can no longer rely > on strtod() to do that), determining if a decimal or hexadecimal parse > was intended (with the new restriction that a fractional hex parse is > not allowed), and where appropriate, using a floating point fractional > parse (where we also scan to reject use of exponents in the fraction). > The bulk of the patch is then updates to the testsuite to match our > new precision, as well as adding new cases we reject (whether they > were rejected or inadvertenly accepted before). > > Signed-off-by: Eric Blake <eblake@redhat.com> > > --- > > Note that this approach differs from what has been attempted in the > past; see the thread starting at > https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg00852.html > That approach tried to parse both as strtoull and strtod and take > whichever was longer, but that was harder to document. > --- > tests/test-cutils.c | 79 ++++++++++++++++++++++++++++++-------- > tests/test-keyval.c | 24 +++++++++--- > tests/test-qemu-opts.c | 20 +++++++--- > util/cutils.c | 77 +++++++++++++++++++++++++++---------- > tests/qemu-iotests/049.out | 7 +++- > 5 files changed, 156 insertions(+), 51 deletions(-) > > diff --git a/tests/test-cutils.c b/tests/test-cutils.c > index 1aa8351520ae..0c2c89d6f113 100644 > --- a/tests/test-cutils.c > +++ b/tests/test-cutils.c > @@ -1960,6 +1960,13 @@ static void test_qemu_strtosz_simple(void) > g_assert_cmpint(res, ==, 0); > g_assert(endptr == str + 1); > > + /* Leading 0 gives decimal results, not octal */ > + str = "08"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, 0); > + g_assert_cmpint(res, ==, 8); > + g_assert(endptr == str + 2); > + > str = "12345"; > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > @@ -1970,7 +1977,7 @@ static void test_qemu_strtosz_simple(void) > g_assert_cmpint(err, ==, 0); > g_assert_cmpint(res, ==, 12345); > > - /* Note: precision is 53 bits since we're parsing with strtod() */ > + /* Note: If there is no '.', we get full 64 bits of precision */ > > str = "9007199254740991"; /* 2^53-1 */ > err = qemu_strtosz(str, &endptr, &res); > @@ -1987,7 +1994,7 @@ static void test_qemu_strtosz_simple(void) > str = "9007199254740993"; /* 2^53+1 */ > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > - g_assert_cmpint(res, ==, 0x20000000000000); /* rounded to 53 bits */ > + g_assert_cmpint(res, ==, 0x20000000000001); > g_assert(endptr == str + 16); > > str = "18446744073709549568"; /* 0xfffffffffffff800 (53 msbs set) */ > @@ -1999,11 +2006,35 @@ static void test_qemu_strtosz_simple(void) > str = "18446744073709550591"; /* 0xfffffffffffffbff */ > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > - g_assert_cmpint(res, ==, 0xfffffffffffff800); /* rounded to 53 bits */ > + g_assert_cmpint(res, ==, 0xfffffffffffffbff); > g_assert(endptr == str + 20); > > - /* 0x7ffffffffffffe00..0x7fffffffffffffff get rounded to > - * 0x8000000000000000, thus -ERANGE; see test_qemu_strtosz_erange() */ > + str = "18446744073709551615"; /* 0xffffffffffffffff */ > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, 0); > + g_assert_cmpint(res, ==, 0xffffffffffffffff); > + g_assert(endptr == str + 20); > + > +} > + > +static void test_qemu_strtosz_hex(void) > +{ > + const char *str; > + const char *endptr; > + int err; > + uint64_t res = 0xbaadf00d; > + > + str = "0x0"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, 0); > + g_assert_cmpint(res, ==, 0); > + g_assert(endptr == str + 3); > + > + str = "0xa"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, 0); > + g_assert_cmpint(res, ==, 10); > + g_assert(endptr == str + 3); > } > > static void test_qemu_strtosz_units(void) > @@ -2106,6 +2137,21 @@ static void test_qemu_strtosz_invalid(void) > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, -EINVAL); > g_assert(endptr == str); > + > + str = "1.1e5"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, -EINVAL); > + g_assert(endptr == str); I'd add also test with 'E' > + > + str = "1.1B"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, -EINVAL); > + g_assert(endptr == str); > + > + str = "0x1.8k"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, -EINVAL); > + g_assert(endptr == str); ha this function looks like it should have const cher *str[] = ["", " \t ", ... "0x1.8k"] and all cases in a for()... and all other test cases may be ... Oh, I should say myself "don't refactor everything you see". > } > > static void test_qemu_strtosz_trailing(void) > @@ -2145,17 +2191,7 @@ static void test_qemu_strtosz_erange(void) > g_assert_cmpint(err, ==, -ERANGE); > g_assert(endptr == str + 2); > > - str = "18446744073709550592"; /* 0xfffffffffffffc00 */ > - err = qemu_strtosz(str, &endptr, &res); > - g_assert_cmpint(err, ==, -ERANGE); > - g_assert(endptr == str + 20); > - > - str = "18446744073709551615"; /* 2^64-1 */ > - err = qemu_strtosz(str, &endptr, &res); > - g_assert_cmpint(err, ==, -ERANGE); > - g_assert(endptr == str + 20); > - > - str = "18446744073709551616"; /* 2^64 */ > + str = "18446744073709551616"; /* 2^64; see strtosz_simple for 2^64-1 */ > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, -ERANGE); > g_assert(endptr == str + 20); > @@ -2168,15 +2204,22 @@ static void test_qemu_strtosz_erange(void) > > static void test_qemu_strtosz_metric(void) > { > - const char *str = "12345k"; > + const char *str; > int err; > const char *endptr; > uint64_t res = 0xbaadf00d; > > + str = "12345k"; > err = qemu_strtosz_metric(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > g_assert_cmpint(res, ==, 12345000); > g_assert(endptr == str + 6); > + > + str = "12.345M"; > + err = qemu_strtosz_metric(str, &endptr, &res); > + g_assert_cmpint(err, ==, 0); > + g_assert_cmpint(res, ==, 12345000); > + g_assert(endptr == str + 7); > } > > int main(int argc, char **argv) > @@ -2443,6 +2486,8 @@ int main(int argc, char **argv) > > g_test_add_func("/cutils/strtosz/simple", > test_qemu_strtosz_simple); > + g_test_add_func("/cutils/strtosz/hex", > + test_qemu_strtosz_hex); > g_test_add_func("/cutils/strtosz/units", > test_qemu_strtosz_units); > g_test_add_func("/cutils/strtosz/float", > diff --git a/tests/test-keyval.c b/tests/test-keyval.c > index ee927fe4e427..13be763650b2 100644 > --- a/tests/test-keyval.c > +++ b/tests/test-keyval.c > @@ -445,9 +445,9 @@ static void test_keyval_visit_size(void) > visit_end_struct(v, NULL); > visit_free(v); > > - /* Note: precision is 53 bits since we're parsing with strtod() */ > + /* Note: full 64 bits of precision */ > > - /* Around limit of precision: 2^53-1, 2^53, 2^53+1 */ > + /* Around double limit of precision: 2^53-1, 2^53, 2^53+1 */ > qdict = keyval_parse("sz1=9007199254740991," > "sz2=9007199254740992," > "sz3=9007199254740993", > @@ -460,7 +460,7 @@ static void test_keyval_visit_size(void) > visit_type_size(v, "sz2", &sz, &error_abort); > g_assert_cmphex(sz, ==, 0x20000000000000); > visit_type_size(v, "sz3", &sz, &error_abort); > - g_assert_cmphex(sz, ==, 0x20000000000000); > + g_assert_cmphex(sz, ==, 0x20000000000001); > visit_check_struct(v, &error_abort); > visit_end_struct(v, NULL); > visit_free(v); > @@ -475,7 +475,7 @@ static void test_keyval_visit_size(void) > visit_type_size(v, "sz1", &sz, &error_abort); > g_assert_cmphex(sz, ==, 0x7ffffffffffffc00); > visit_type_size(v, "sz2", &sz, &error_abort); > - g_assert_cmphex(sz, ==, 0x7ffffffffffffc00); > + g_assert_cmphex(sz, ==, 0x7ffffffffffffdff); > visit_check_struct(v, &error_abort); > visit_end_struct(v, NULL); > visit_free(v); > @@ -490,14 +490,26 @@ static void test_keyval_visit_size(void) > visit_type_size(v, "sz1", &sz, &error_abort); > g_assert_cmphex(sz, ==, 0xfffffffffffff800); > visit_type_size(v, "sz2", &sz, &error_abort); > - g_assert_cmphex(sz, ==, 0xfffffffffffff800); > + g_assert_cmphex(sz, ==, 0xfffffffffffffbff); > + visit_check_struct(v, &error_abort); > + visit_end_struct(v, NULL); > + visit_free(v); > + > + /* Actual limit */ > + qdict = keyval_parse("sz1=18446744073709551615", /* ffffffffffffffff */ > + NULL, NULL, &error_abort); > + v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); > + qobject_unref(qdict); > + visit_start_struct(v, NULL, NULL, 0, &error_abort); > + visit_type_size(v, "sz1", &sz, &error_abort); > + g_assert_cmphex(sz, ==, 0xffffffffffffffff); > visit_check_struct(v, &error_abort); > visit_end_struct(v, NULL); > visit_free(v); > > /* Beyond limits */ > qdict = keyval_parse("sz1=-1," > - "sz2=18446744073709550592", /* fffffffffffffc00 */ > + "sz2=18446744073709551616", /* 2^64 */ > NULL, NULL, &error_abort); > v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); > qobject_unref(qdict); > diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c > index 8bbb17b1c726..f79b698e6715 100644 > --- a/tests/test-qemu-opts.c > +++ b/tests/test-qemu-opts.c > @@ -654,9 +654,9 @@ static void test_opts_parse_size(void) > g_assert_cmpuint(opts_count(opts), ==, 1); > g_assert_cmpuint(qemu_opt_get_size(opts, "size1", 1), ==, 0); > > - /* Note: precision is 53 bits since we're parsing with strtod() */ > + /* Note: full 64 bits of precision */ > > - /* Around limit of precision: 2^53-1, 2^53, 2^54 */ > + /* Around double limit of precision: 2^53-1, 2^53, 2^53+1 */ > opts = qemu_opts_parse(&opts_list_02, > "size1=9007199254740991," > "size2=9007199254740992," > @@ -668,7 +668,7 @@ static void test_opts_parse_size(void) > g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1), > ==, 0x20000000000000); > g_assert_cmphex(qemu_opt_get_size(opts, "size3", 1), > - ==, 0x20000000000000); > + ==, 0x20000000000001); > > /* Close to signed upper limit 0x7ffffffffffffc00 (53 msbs set) */ should fix comment? > opts = qemu_opts_parse(&opts_list_02, > @@ -679,7 +679,7 @@ static void test_opts_parse_size(void) > g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1), > ==, 0x7ffffffffffffc00); > g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1), > - ==, 0x7ffffffffffffc00); > + ==, 0x7ffffffffffffdff); > > /* Close to actual upper limit 0xfffffffffffff800 (53 msbs set) */ and here? and probably in a previous file > opts = qemu_opts_parse(&opts_list_02, > @@ -690,14 +690,22 @@ static void test_opts_parse_size(void) > g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1), > ==, 0xfffffffffffff800); > g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1), > - ==, 0xfffffffffffff800); > + ==, 0xfffffffffffffbff); > + > + /* Actual limit */ > + opts = qemu_opts_parse(&opts_list_02, > + "size1=18446744073709551615", /* ffffffffffffffff */ > + false, &error_abort); > + g_assert_cmpuint(opts_count(opts), ==, 1); > + g_assert_cmphex(qemu_opt_get_size(opts, "size1", 1), > + ==, 0xffffffffffffffff); > > /* Beyond limits */ > opts = qemu_opts_parse(&opts_list_02, "size1=-1", false, &err); > error_free_or_abort(&err); > g_assert(!opts); > opts = qemu_opts_parse(&opts_list_02, > - "size1=18446744073709550592", /* fffffffffffffc00 */ > + "size1=18446744073709551616", /* 2^64 */ > false, &err); > error_free_or_abort(&err); > g_assert(!opts); Test modifications above all looks OK. > diff --git a/util/cutils.c b/util/cutils.c > index 0b5073b33012..0234763bd70b 100644 > --- a/util/cutils.c > +++ b/util/cutils.c > @@ -241,10 +241,21 @@ static int64_t suffix_mul(char suffix, int64_t unit) > } > > /* > - * Convert string to bytes, allowing either B/b for bytes, K/k for KB, > - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned > - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on > - * other error. > + * Convert size string to bytes. > + * > + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or > + * T/t for TB, with scaling based on @unit, and with @default_suffix > + * implied if no explicit suffix was given. > + * > + * The end pointer will be returned in *end, if not NULL. If there is > + * no fraction, the input can be decimal or hexadecimal; if there is a > + * fraction, then the input must be decimal and there must be a suffix > + * (possibly by @default_suffix) larger than Byte, and the fractional > + * portion may suffer from precision loss or rounding. The input must > + * be positive. > + * > + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on > + * other error (with *@end left unchanged). > */ > static int do_strtosz(const char *nptr, const char **end, > const char default_suffix, int64_t unit, > @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end, > int retval; > const char *endptr; > unsigned char c; > - int mul_required = 0; > - double val, mul, integral, fraction; > + bool mul_required = false; > + uint64_t val; > + int64_t mul; > + double fraction = 0.0; > > - retval = qemu_strtod_finite(nptr, &endptr, &val); > + /* Parse integral portion as decimal. */ > + retval = qemu_strtou64(nptr, &endptr, 10, &val); > if (retval) { > goto out; > } > - fraction = modf(val, &integral); > - if (fraction != 0) { > - mul_required = 1; > + if (strchr(nptr, '-') != NULL) { > + retval = -ERANGE; > + goto out; > + } Hmm, I have a question: does do_strtosz assumes that the whole nptr string is a number? If yes, then I don't understand what the matter of **end OUT-argument. If not, this if() is wrong, as you'll redject parsing first number of "123425 - 2323" string.. > + if (val == 0 && (*endptr == 'x' || *endptr == 'X')) { > + /* Input looks like hex, reparse, and insist on no fraction. */ > + retval = qemu_strtou64(nptr, &endptr, 16, &val); > + if (retval) { > + goto out; > + } > + if (*endptr == '.') { > + endptr = nptr; > + retval = -EINVAL; > + goto out; > + } > + } else if (*endptr == '.') { > + /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */ > + retval = qemu_strtod_finite(endptr, &endptr, &fraction); > + if (retval) { > + endptr = nptr; > + goto out; > + } > + if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr) > + || memchr(nptr, 'E', endptr - nptr)) { > + endptr = nptr; > + retval = -EINVAL; > + goto out; > + } > + if (fraction != 0) { > + mul_required = true; > + } > } > c = *endptr; > mul = suffix_mul(c, unit); > - if (mul >= 0) { > + if (mul > 0) { > endptr++; > } else { > mul = suffix_mul(default_suffix, unit); > - assert(mul >= 0); > + assert(mul > 0); > } > if (mul == 1 && mul_required) { > + endptr = nptr; > retval = -EINVAL; > goto out; > } > - /* > - * Values near UINT64_MAX overflow to 2**64 when converting to double > - * precision. Compare against the maximum representable double precision > - * value below 2**64, computed as "the next value after 2**64 (0x1p64) in > - * the direction of 0". > - */ > - if ((val * mul > nextafter(0x1p64, 0)) || val < 0) { > + if (val > UINT64_MAX / mul) { > retval = -ERANGE; > goto out; > } > - *result = val * mul; > + *result = val * mul + (uint64_t) (fraction * mul); > retval = 0; > > out: > diff --git a/tests/qemu-iotests/049.out b/tests/qemu-iotests/049.out > index b1d8fd9107ef..f38d3ccd5978 100644 > --- a/tests/qemu-iotests/049.out > +++ b/tests/qemu-iotests/049.out > @@ -98,10 +98,13 @@ qemu-img create -f qcow2 -o size=-1024 TEST_DIR/t.qcow2 > qemu-img: TEST_DIR/t.qcow2: Value '-1024' is out of range for parameter 'size' > > qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- -1k > -qemu-img: Invalid image size specified. Must be between 0 and 9223372036854775807. > +qemu-img: Invalid image size specified. You may use k, M, G, T, P or E suffixes for > +qemu-img: kilobytes, megabytes, gigabytes, terabytes, petabytes and exabytes. > > qemu-img create -f qcow2 -o size=-1k TEST_DIR/t.qcow2 > -qemu-img: TEST_DIR/t.qcow2: Value '-1k' is out of range for parameter 'size' > +qemu-img: TEST_DIR/t.qcow2: Parameter 'size' expects a non-negative number below 2^64 > +Optional suffix k, M, G, T, P or E means kilo-, mega-, giga-, tera-, peta- > +and exabytes, respectively. > > qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- 1kilobyte > qemu-img: Invalid image size specified. You may use k, M, G, T, P or E suffixes for > -- Best regards, Vladimir ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-05 10:07 ` Vladimir Sementsov-Ogievskiy @ 2021-02-05 14:12 ` Eric Blake 0 siblings, 0 replies; 31+ messages in thread From: Eric Blake @ 2021-02-05 14:12 UTC (permalink / raw) To: Vladimir Sementsov-Ogievskiy, qemu-devel Cc: Kevin Wolf, berrange, qemu-block, tao3.xu, rjones, armbru, Max Reitz On 2/5/21 4:07 AM, Vladimir Sementsov-Ogievskiy wrote: >> @@ -2106,6 +2137,21 @@ static void test_qemu_strtosz_invalid(void) >> err = qemu_strtosz(str, &endptr, &res); >> g_assert_cmpint(err, ==, -EINVAL); >> g_assert(endptr == str); >> + >> + str = "1.1e5"; >> + err = qemu_strtosz(str, &endptr, &res); >> + g_assert_cmpint(err, ==, -EINVAL); >> + g_assert(endptr == str); > > I'd add also test with 'E' Will do. For v2, I'll probably split this patch, first into adding new test cases (with demonstrations of what we deem to be buggy parses), and the second showing how those cases improve as we change the code. > >> + >> + str = "1.1B"; >> + err = qemu_strtosz(str, &endptr, &res); >> + g_assert_cmpint(err, ==, -EINVAL); >> + g_assert(endptr == str); >> + >> + str = "0x1.8k"; >> + err = qemu_strtosz(str, &endptr, &res); >> + g_assert_cmpint(err, ==, -EINVAL); >> + g_assert(endptr == str); > > ha this function looks like it should have > > const cher *str[] = ["", " \t ", ... "0x1.8k"] > > and all cases in a for()... and all other test cases may be ... Oh, I > should say myself "don't refactor everything you see". I'll think about it. It's already odd that the tests are split between so many functions. >> @@ -668,7 +668,7 @@ static void test_opts_parse_size(void) >> g_assert_cmphex(qemu_opt_get_size(opts, "size2", 1), >> ==, 0x20000000000000); >> g_assert_cmphex(qemu_opt_get_size(opts, "size3", 1), >> - ==, 0x20000000000000); >> + ==, 0x20000000000001); >> >> /* Close to signed upper limit 0x7ffffffffffffc00 (53 msbs set) */ > > should fix comment? Yes, I did it in one file but not the other, so I'll make it consistent. The point of the comment post-patch is that we are demonstrating that our implementation is NOT bound by the limits of a double's precision. >> - retval = qemu_strtod_finite(nptr, &endptr, &val); >> + /* Parse integral portion as decimal. */ >> + retval = qemu_strtou64(nptr, &endptr, 10, &val); >> if (retval) { >> goto out; >> } >> - fraction = modf(val, &integral); >> - if (fraction != 0) { >> - mul_required = 1; >> + if (strchr(nptr, '-') != NULL) { >> + retval = -ERANGE; >> + goto out; >> + } > > Hmm, I have a question: does do_strtosz assumes that the whole nptr > string is a number? No. In fact, our testsuite demonstrates the difference between endptr as a pointer (we parse what we can and advance *endptr to the tail) and as NULL (trailing garbage is an error). > > If yes, then I don't understand what the matter of **end OUT-argument. > > If not, this if() is wrong, as you'll redject parsing first number of > "123425 - 2323" string.. Yep, good catch. This needs to use memchr, similar to the check for 'e' in the fraction portion below. >> + retval = qemu_strtod_finite(endptr, &endptr, &fraction); >> + if (retval) { >> + endptr = nptr; >> + goto out; >> + } >> + if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr) >> + || memchr(nptr, 'E', endptr - nptr)) { >> + endptr = nptr; -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake 2021-02-04 20:12 ` Eric Blake 2021-02-05 10:07 ` Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:28 ` Richard W.M. Jones 2021-02-05 14:15 ` Eric Blake 2021-02-05 11:02 ` Daniel P. Berrangé 2021-02-05 11:34 ` Daniel P. Berrangé 4 siblings, 1 reply; 31+ messages in thread From: Richard W.M. Jones @ 2021-02-05 10:28 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, vsementsov, berrange, qemu-block, tao3.xu, qemu-devel, armbru, Max Reitz On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote: > We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, > the keyval visitor), and it gets annoying that edge-case testing is > impacted by implicit rounding to 53 bits of precision due to parsing > with strtod(). As an example posted by Rich Jones: > $ nbdkit memory $(( 2**63 - 2**30 )) --run \ > 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' > write failed: Input/output error > > because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is > out of bounds. > > It is also worth noting that our existing parser, by virtue of using > strtod(), accepts decimal AND hex numbers, even though test-cutils > previously lacked any coverage of the latter. We do have existing > clients that expect a hex parse to work (for example, iotest 33 using > qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 > rather than as an invalid octal number, so we know there are no > clients that depend on octal. Our use of strtod() also means that > "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather > than 1843 (if the fraction were 8/10); but as this was not covered in > the testsuite, I have no qualms forbidding hex fractions as invalid, > so this patch declares that the use of fractions is only supported > with decimal input, and enhances the testsuite to document that. > > Our previous use of strtod() meant that -1 parsed as a negative; now > that we parse with strtoull(), negative values can wrap around module ^^ modulo The patch looked fine to me although Vladimir found some problems which I didn't spot. I have a question: What happens with leading or trailing whitespace? Is that ignored, rejected or impossible? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-05 10:28 ` Richard W.M. Jones @ 2021-02-05 14:15 ` Eric Blake 0 siblings, 0 replies; 31+ messages in thread From: Eric Blake @ 2021-02-05 14:15 UTC (permalink / raw) To: Richard W.M. Jones Cc: Kevin Wolf, vsementsov, berrange, qemu-block, tao3.xu, qemu-devel, armbru, Max Reitz On 2/5/21 4:28 AM, Richard W.M. Jones wrote: > On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote: >> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, >> the keyval visitor), and it gets annoying that edge-case testing is >> impacted by implicit rounding to 53 bits of precision due to parsing >> with strtod(). As an example posted by Rich Jones: >> $ nbdkit memory $(( 2**63 - 2**30 )) --run \ >> 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' >> write failed: Input/output error >> >> because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is >> out of bounds. >> >> It is also worth noting that our existing parser, by virtue of using >> strtod(), accepts decimal AND hex numbers, even though test-cutils >> previously lacked any coverage of the latter. We do have existing >> clients that expect a hex parse to work (for example, iotest 33 using >> qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 >> rather than as an invalid octal number, so we know there are no >> clients that depend on octal. Our use of strtod() also means that >> "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather >> than 1843 (if the fraction were 8/10); but as this was not covered in >> the testsuite, I have no qualms forbidding hex fractions as invalid, >> so this patch declares that the use of fractions is only supported >> with decimal input, and enhances the testsuite to document that. >> >> Our previous use of strtod() meant that -1 parsed as a negative; now >> that we parse with strtoull(), negative values can wrap around module > > ^^ modulo > > The patch looked fine to me although Vladimir found some problems > which I didn't spot. I have a question: What happens with leading or > trailing whitespace? Is that ignored, rejected or impossible? leading whitespace: ignored (because both strtod() pre-patch, and now strtoull() post-patch, do so for free). And that is why we have to memchr() (and not strchr(), as pointed out by Vladimir) for a '-' sign, because merely checking *nptr=='-' would be wrong in the presence of leading space. trailing whitespace: treated the same as any other trailing garbage (again, what strtod() and strtoull() give you for free). If endptr was non-NULL, then *endptr now points to that trailing space; if it was NULL, the parse is rejected because of the trailing garbage. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake ` (2 preceding siblings ...) 2021-02-05 10:28 ` Richard W.M. Jones @ 2021-02-05 11:02 ` Daniel P. Berrangé 2021-02-05 14:27 ` Eric Blake 2021-02-05 11:34 ` Daniel P. Berrangé 4 siblings, 1 reply; 31+ messages in thread From: Daniel P. Berrangé @ 2021-02-05 11:02 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel, Max Reitz On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote: > We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, > the keyval visitor), and it gets annoying that edge-case testing is > impacted by implicit rounding to 53 bits of precision due to parsing > with strtod(). As an example posted by Rich Jones: > $ nbdkit memory $(( 2**63 - 2**30 )) --run \ > 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' > write failed: Input/output error > > because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is > out of bounds. > > It is also worth noting that our existing parser, by virtue of using > strtod(), accepts decimal AND hex numbers, even though test-cutils > previously lacked any coverage of the latter. We do have existing > clients that expect a hex parse to work (for example, iotest 33 using > qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 > rather than as an invalid octal number, so we know there are no > clients that depend on octal. Our use of strtod() also means that > "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather > than 1843 (if the fraction were 8/10); but as this was not covered in > the testsuite, I have no qualms forbidding hex fractions as invalid, > so this patch declares that the use of fractions is only supported > with decimal input, and enhances the testsuite to document that. > > Our previous use of strtod() meant that -1 parsed as a negative; now > that we parse with strtoull(), negative values can wrap around module > 2^64, so we have to explicitly check whether the user passed in a '-'. > > We also had no testsuite coverage of "1.1e0k", which happened to parse > under strtod() but is unlikely to occur in practice; as long as we are > making things more robust, it is easy enough to reject the use of > exponents in a strtod parse. > > The fix is done by breaking the parse into an integer prefix (no loss > in precision), rejecting negative values (since we can no longer rely > on strtod() to do that), determining if a decimal or hexadecimal parse > was intended (with the new restriction that a fractional hex parse is > not allowed), and where appropriate, using a floating point fractional > parse (where we also scan to reject use of exponents in the fraction). > The bulk of the patch is then updates to the testsuite to match our > new precision, as well as adding new cases we reject (whether they > were rejected or inadvertenly accepted before). > > Signed-off-by: Eric Blake <eblake@redhat.com> > > diff --git a/util/cutils.c b/util/cutils.c > index 0b5073b33012..0234763bd70b 100644 > --- a/util/cutils.c > +++ b/util/cutils.c > @@ -241,10 +241,21 @@ static int64_t suffix_mul(char suffix, int64_t unit) > } > > /* > - * Convert string to bytes, allowing either B/b for bytes, K/k for KB, > - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned > - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on > - * other error. > + * Convert size string to bytes. > + * > + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or > + * T/t for TB, with scaling based on @unit, and with @default_suffix > + * implied if no explicit suffix was given. > + * > + * The end pointer will be returned in *end, if not NULL. If there is > + * no fraction, the input can be decimal or hexadecimal; if there is a > + * fraction, then the input must be decimal and there must be a suffix > + * (possibly by @default_suffix) larger than Byte, and the fractional > + * portion may suffer from precision loss or rounding. The input must > + * be positive. Even though the test suite gives some illustrations, I think we should document here the patterns we're intending to support. IIUC, we aim for [quote] The size parsing supports the following syntaxes - 12345 - decimal, bytes - 12345{bBkKmMgGtT} - decimal, scaled bytes - 12345.678 - fractional decimal, bytes - 12345.678{bBkKmMgGtT} - fractional decimal, scaled bytes - 0x7FEE - hex, bytes The following are intentionally not supported - octal - fractional hex - floating point exponents [/quote] > + * > + * Return -ERANGE on overflow (with *@end advanced), and -EINVAL on > + * other error (with *@end left unchanged). > */ Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-05 11:02 ` Daniel P. Berrangé @ 2021-02-05 14:27 ` Eric Blake 2021-02-05 14:36 ` Daniel P. Berrangé 0 siblings, 1 reply; 31+ messages in thread From: Eric Blake @ 2021-02-05 14:27 UTC (permalink / raw) To: Daniel P. Berrangé Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel, Max Reitz On 2/5/21 5:02 AM, Daniel P. Berrangé wrote: >> /* >> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB, >> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned >> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on >> - * other error. >> + * Convert size string to bytes. >> + * >> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or >> + * T/t for TB, with scaling based on @unit, and with @default_suffix >> + * implied if no explicit suffix was given. >> + * >> + * The end pointer will be returned in *end, if not NULL. If there is >> + * no fraction, the input can be decimal or hexadecimal; if there is a >> + * fraction, then the input must be decimal and there must be a suffix >> + * (possibly by @default_suffix) larger than Byte, and the fractional >> + * portion may suffer from precision loss or rounding. The input must >> + * be positive. > > Even though the test suite gives some illustrations, I think we should > document here the patterns we're intending to support. IIUC, we aim for > > [quote] > The size parsing supports the following syntaxes > > - 12345 - decimal, bytes > - 12345{bBkKmMgGtT} - decimal, scaled bytes Yes. Actually 12345{bBkKmMgGtTpPeE}, as long as it doesn't overflow 16E. > - 12345.678 - fractional decimal, bytes No - this was already rejected prior to my patch, and my patch keeps it as rejected (that's the whole mul_required bool, which checks whether mul > 1). > - 12345.678{bBkKmMgGtT} - fractional decimal, scaled bytes Close; we reject bB in this situation for the same reason that we reject fractional decimal without suffix. Also, patch 3/3 was questioning whether the fractional portion must be exact or is permitted to round > - 0x7FEE - hex, bytes Yes, and with the additional note that 'E' and 'B' are treated as hex digits, not scale suffixes, in this form. > > The following are intentionally not supported > > - octal Never has worked. > - fractional hex worked by accident, dropping it in this patch is not deemed worth a deprecation period. > - floating point exponents worked by accident, dropping it in this patch is not deemed worth a deprecation period. > [/quote] and with one more form: patch 2/3 - 0x123abc{kKmMgGtTpP} - hex, scaled bytes, with limited set of working suffixes, and slated for future removal (this one is barely plausible enough that we need the deprecation period to see who uses it) At any rate, I like the idea of being more explicit in the function description, so I will enhance v2 along these lines. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-05 14:27 ` Eric Blake @ 2021-02-05 14:36 ` Daniel P. Berrangé 0 siblings, 0 replies; 31+ messages in thread From: Daniel P. Berrangé @ 2021-02-05 14:36 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel, Max Reitz On Fri, Feb 05, 2021 at 08:27:14AM -0600, Eric Blake wrote: > On 2/5/21 5:02 AM, Daniel P. Berrangé wrote: > > >> /* > >> - * Convert string to bytes, allowing either B/b for bytes, K/k for KB, > >> - * M/m for MB, G/g for GB or T/t for TB. End pointer will be returned > >> - * in *end, if not NULL. Return -ERANGE on overflow, and -EINVAL on > >> - * other error. > >> + * Convert size string to bytes. > >> + * > >> + * Allow either B/b for bytes, K/k for KB, M/m for MB, G/g for GB or > >> + * T/t for TB, with scaling based on @unit, and with @default_suffix > >> + * implied if no explicit suffix was given. > >> + * > >> + * The end pointer will be returned in *end, if not NULL. If there is > >> + * no fraction, the input can be decimal or hexadecimal; if there is a > >> + * fraction, then the input must be decimal and there must be a suffix > >> + * (possibly by @default_suffix) larger than Byte, and the fractional > >> + * portion may suffer from precision loss or rounding. The input must > >> + * be positive. > > > > Even though the test suite gives some illustrations, I think we should > > document here the patterns we're intending to support. IIUC, we aim for > > > > [quote] > > The size parsing supports the following syntaxes > > > > - 12345 - decimal, bytes > > - 12345{bBkKmMgGtT} - decimal, scaled bytes > > Yes. Actually 12345{bBkKmMgGtTpPeE}, as long as it doesn't overflow 16E. > > > - 12345.678 - fractional decimal, bytes > > No - this was already rejected prior to my patch, and my patch keeps it > as rejected (that's the whole mul_required bool, which checks whether > mul > 1). Oh yes, of course. > > - 12345.678{bBkKmMgGtT} - fractional decimal, scaled bytes > > Close; we reject bB in this situation for the same reason that we reject > fractional decimal without suffix. Also, patch 3/3 was questioning > whether the fractional portion must be exact or is permitted to round Yep, rejecting bB makes sense. > > > - 0x7FEE - hex, bytes > > Yes, and with the additional note that 'E' and 'B' are treated as hex > digits, not scale suffixes, in this form. > > > > > The following are intentionally not supported > > > > - octal > > Never has worked. > > > - fractional hex > > worked by accident, dropping it in this patch is not deemed worth a > deprecation period. > > > - floating point exponents > > worked by accident, dropping it in this patch is not deemed worth a > deprecation period. > > > [/quote] > > and with one more form: > > patch 2/3 > - 0x123abc{kKmMgGtTpP} - hex, scaled bytes, with limited set of working > suffixes, and slated for future removal (this one is barely plausible > enough that we need the deprecation period to see who uses it) Ok Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake ` (3 preceding siblings ...) 2021-02-05 11:02 ` Daniel P. Berrangé @ 2021-02-05 11:34 ` Daniel P. Berrangé 2021-02-05 14:36 ` Eric Blake 4 siblings, 1 reply; 31+ messages in thread From: Daniel P. Berrangé @ 2021-02-05 11:34 UTC (permalink / raw) To: Eric Blake Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel, Max Reitz On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote: > We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, > the keyval visitor), and it gets annoying that edge-case testing is > impacted by implicit rounding to 53 bits of precision due to parsing > with strtod(). As an example posted by Rich Jones: > $ nbdkit memory $(( 2**63 - 2**30 )) --run \ > 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' > write failed: Input/output error > > because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is > out of bounds. > > It is also worth noting that our existing parser, by virtue of using > strtod(), accepts decimal AND hex numbers, even though test-cutils > previously lacked any coverage of the latter. We do have existing > clients that expect a hex parse to work (for example, iotest 33 using > qemu-io -c "write -P 0xa 0x200 0x400"), but strtod() parses "08" as 8 > rather than as an invalid octal number, so we know there are no > clients that depend on octal. Our use of strtod() also means that > "0x1.8k" would actually parse as 1536 (the fraction is 8/16), rather > than 1843 (if the fraction were 8/10); but as this was not covered in > the testsuite, I have no qualms forbidding hex fractions as invalid, > so this patch declares that the use of fractions is only supported > with decimal input, and enhances the testsuite to document that. > > Our previous use of strtod() meant that -1 parsed as a negative; now > that we parse with strtoull(), negative values can wrap around module > 2^64, so we have to explicitly check whether the user passed in a '-'. > > We also had no testsuite coverage of "1.1e0k", which happened to parse > under strtod() but is unlikely to occur in practice; as long as we are > making things more robust, it is easy enough to reject the use of > exponents in a strtod parse. > > The fix is done by breaking the parse into an integer prefix (no loss > in precision), rejecting negative values (since we can no longer rely > on strtod() to do that), determining if a decimal or hexadecimal parse > was intended (with the new restriction that a fractional hex parse is > not allowed), and where appropriate, using a floating point fractional > parse (where we also scan to reject use of exponents in the fraction). > The bulk of the patch is then updates to the testsuite to match our > new precision, as well as adding new cases we reject (whether they > were rejected or inadvertenly accepted before). > > Signed-off-by: Eric Blake <eblake@redhat.com> > > --- > > Note that this approach differs from what has been attempted in the > past; see the thread starting at > https://lists.gnu.org/archive/html/qemu-devel/2019-12/msg00852.html > That approach tried to parse both as strtoull and strtod and take > whichever was longer, but that was harder to document. > --- > tests/test-cutils.c | 79 ++++++++++++++++++++++++++++++-------- > tests/test-keyval.c | 24 +++++++++--- > tests/test-qemu-opts.c | 20 +++++++--- > util/cutils.c | 77 +++++++++++++++++++++++++++---------- > tests/qemu-iotests/049.out | 7 +++- > 5 files changed, 156 insertions(+), 51 deletions(-) > > diff --git a/tests/test-cutils.c b/tests/test-cutils.c > index 1aa8351520ae..0c2c89d6f113 100644 > --- a/tests/test-cutils.c > +++ b/tests/test-cutils.c > @@ -1960,6 +1960,13 @@ static void test_qemu_strtosz_simple(void) > g_assert_cmpint(res, ==, 0); > g_assert(endptr == str + 1); > > + /* Leading 0 gives decimal results, not octal */ > + str = "08"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, 0); > + g_assert_cmpint(res, ==, 8); > + g_assert(endptr == str + 2); > + > str = "12345"; > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > @@ -1970,7 +1977,7 @@ static void test_qemu_strtosz_simple(void) > g_assert_cmpint(err, ==, 0); > g_assert_cmpint(res, ==, 12345); > > - /* Note: precision is 53 bits since we're parsing with strtod() */ > + /* Note: If there is no '.', we get full 64 bits of precision */ IIUC, our goal is that we should never loose precision for the non-fractional part. > > str = "9007199254740991"; /* 2^53-1 */ > err = qemu_strtosz(str, &endptr, &res); > @@ -1987,7 +1994,7 @@ static void test_qemu_strtosz_simple(void) > str = "9007199254740993"; /* 2^53+1 */ > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > - g_assert_cmpint(res, ==, 0x20000000000000); /* rounded to 53 bits */ > + g_assert_cmpint(res, ==, 0x20000000000001); > g_assert(endptr == str + 16); > > str = "18446744073709549568"; /* 0xfffffffffffff800 (53 msbs set) */ > @@ -1999,11 +2006,35 @@ static void test_qemu_strtosz_simple(void) > str = "18446744073709550591"; /* 0xfffffffffffffbff */ > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > - g_assert_cmpint(res, ==, 0xfffffffffffff800); /* rounded to 53 bits */ > + g_assert_cmpint(res, ==, 0xfffffffffffffbff); > g_assert(endptr == str + 20); > > - /* 0x7ffffffffffffe00..0x7fffffffffffffff get rounded to > - * 0x8000000000000000, thus -ERANGE; see test_qemu_strtosz_erange() */ > + str = "18446744073709551615"; /* 0xffffffffffffffff */ > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, 0); > + g_assert_cmpint(res, ==, 0xffffffffffffffff); > + g_assert(endptr == str + 20); > + > +} > + > +static void test_qemu_strtosz_hex(void) > +{ > + const char *str; > + const char *endptr; > + int err; > + uint64_t res = 0xbaadf00d; > + > + str = "0x0"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, 0); > + g_assert_cmpint(res, ==, 0); > + g_assert(endptr == str + 3); > + > + str = "0xa"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, 0); > + g_assert_cmpint(res, ==, 10); > + g_assert(endptr == str + 3); I'd stick a '0xab' or "0xae" in there, to demonstrate that we're not accidentally applyin the scaling units for "b" and "e" in hex. > } > > static void test_qemu_strtosz_units(void) > @@ -2106,6 +2137,21 @@ static void test_qemu_strtosz_invalid(void) > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, -EINVAL); > g_assert(endptr == str); > + > + str = "1.1e5"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, -EINVAL); > + g_assert(endptr == str); > + > + str = "1.1B"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, -EINVAL); > + g_assert(endptr == str); > + > + str = "0x1.8k"; > + err = qemu_strtosz(str, &endptr, &res); > + g_assert_cmpint(err, ==, -EINVAL); > + g_assert(endptr == str); > } A test case to reject negative numers ? > @@ -253,40 +264,66 @@ static int do_strtosz(const char *nptr, const char **end, > int retval; > const char *endptr; > unsigned char c; > - int mul_required = 0; > - double val, mul, integral, fraction; > + bool mul_required = false; > + uint64_t val; > + int64_t mul; > + double fraction = 0.0; > > - retval = qemu_strtod_finite(nptr, &endptr, &val); > + /* Parse integral portion as decimal. */ > + retval = qemu_strtou64(nptr, &endptr, 10, &val); > if (retval) { > goto out; > } > - fraction = modf(val, &integral); > - if (fraction != 0) { > - mul_required = 1; > + if (strchr(nptr, '-') != NULL) { > + retval = -ERANGE; > + goto out; > + } > + if (val == 0 && (*endptr == 'x' || *endptr == 'X')) { This works as an approach but it might be more clear to just check for hex upfront. if (g_str_has_prefix("0x", val) || g_str_has_prefix("0X", val)) { ... do hex parse.. } else { ... do dec parse.. } > + /* Input looks like hex, reparse, and insist on no fraction. */ > + retval = qemu_strtou64(nptr, &endptr, 16, &val); > + if (retval) { > + goto out; > + } > + if (*endptr == '.') { > + endptr = nptr; > + retval = -EINVAL; > + goto out; > + } > + } else if (*endptr == '.') { > + /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */ > + retval = qemu_strtod_finite(endptr, &endptr, &fraction); > + if (retval) { > + endptr = nptr; > + goto out; > + } > + if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr) > + || memchr(nptr, 'E', endptr - nptr)) { Can this be simplified ? Fraction can be > 1, if-and only-if exponent syntax is used IIUC. Shouldn't we be passing 'endptr+1' as the first arg to memchr ? nptr points to the leading decimal part, and we only need to scan the fractional part. Also what happens with "1.1e" - that needs to be treated as a exabyte suffix, and not rejected as an exponent. We ought to test that if we don't already. > + endptr = nptr; > + retval = -EINVAL; > + goto out; > + } > + if (fraction != 0) { > + mul_required = true; > + } > } > c = *endptr; > mul = suffix_mul(c, unit); > - if (mul >= 0) { > + if (mul > 0) { > endptr++; > } else { > mul = suffix_mul(default_suffix, unit); > - assert(mul >= 0); > + assert(mul > 0); > } > if (mul == 1 && mul_required) { > + endptr = nptr; > retval = -EINVAL; > goto out; > } > - /* > - * Values near UINT64_MAX overflow to 2**64 when converting to double > - * precision. Compare against the maximum representable double precision > - * value below 2**64, computed as "the next value after 2**64 (0x1p64) in > - * the direction of 0". > - */ > - if ((val * mul > nextafter(0x1p64, 0)) || val < 0) { > + if (val > UINT64_MAX / mul) { > retval = -ERANGE; > goto out; > } > - *result = val * mul; > + *result = val * mul + (uint64_t) (fraction * mul); > retval = 0; > > out: Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision 2021-02-05 11:34 ` Daniel P. Berrangé @ 2021-02-05 14:36 ` Eric Blake 0 siblings, 0 replies; 31+ messages in thread From: Eric Blake @ 2021-02-05 14:36 UTC (permalink / raw) To: Daniel P. Berrangé Cc: Kevin Wolf, vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel, Max Reitz On 2/5/21 5:34 AM, Daniel P. Berrangé wrote: > On Thu, Feb 04, 2021 at 01:07:06PM -0600, Eric Blake wrote: >> We have multiple clients of qemu_strtosz (qemu-io, the opts visitor, >> the keyval visitor), and it gets annoying that edge-case testing is >> impacted by implicit rounding to 53 bits of precision due to parsing >> with strtod(). As an example posted by Rich Jones: >> $ nbdkit memory $(( 2**63 - 2**30 )) --run \ >> 'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" ' >> write failed: Input/output error >> >> @@ -1970,7 +1977,7 @@ static void test_qemu_strtosz_simple(void) >> g_assert_cmpint(err, ==, 0); >> g_assert_cmpint(res, ==, 12345); >> >> - /* Note: precision is 53 bits since we're parsing with strtod() */ >> + /* Note: If there is no '.', we get full 64 bits of precision */ > > IIUC, our goal is that we should never loose precision for the > non-fractional part. Correct. Maybe I can tweak that comment more as part of splitting this patch into testsuite improvements (highlighting existing shortfalls) then the actual implementation change (with its corresponding testsuite tweaks to demonstrate how it is fixed). >> + >> + str = "0xa"; >> + err = qemu_strtosz(str, &endptr, &res); >> + g_assert_cmpint(err, ==, 0); >> + g_assert_cmpint(res, ==, 10); >> + g_assert(endptr == str + 3); > > I'd stick a '0xab' or "0xae" in there, to demonstrate that we're > not accidentally applyin the scaling units for "b" and "e" in hex. Will do. >> + str = "0x1.8k"; >> + err = qemu_strtosz(str, &endptr, &res); >> + g_assert_cmpint(err, ==, -EINVAL); >> + g_assert(endptr == str); >> } > > A test case to reject negative numers ? We already have one, under the _erange section. Then again, there's an oddity: with ERANGE, we tend to leave endptr pointing to the end of what we did parse, while with EINVAL, we tend to leave endptr pointing to nptr saying we parsed nothing at all. It's hard to decide whether -1 is an ERANGE (negative is not permitted, but we successfully parsed two characters and so advanced endptr), or an EINVAL (we don't allow negative at all, so endptr is nptr). If anyone has preferences, now would be the time to document the semantics we want; although since we already have an existing test under _range for -1, I tried to keep that behavior unchanged. >> - retval = qemu_strtod_finite(nptr, &endptr, &val); >> + /* Parse integral portion as decimal. */ >> + retval = qemu_strtou64(nptr, &endptr, 10, &val); >> if (retval) { >> goto out; >> } >> - fraction = modf(val, &integral); >> - if (fraction != 0) { >> - mul_required = 1; >> + if (strchr(nptr, '-') != NULL) { >> + retval = -ERANGE; >> + goto out; >> + } >> + if (val == 0 && (*endptr == 'x' || *endptr == 'X')) { > > This works as an approach but it might be more clear to > just check for hex upfront. > > if (g_str_has_prefix("0x", val) || > g_str_has_prefix("0X", val)) { Won't work nicely. strtoull() allows for leading whitespace, at which point we are duplicating a lot of functionality to likewise skip such leading whitespace before checking the prefix. > ... do hex parse.. > } else { > ... do dec parse.. > } > > >> + /* Input looks like hex, reparse, and insist on no fraction. */ >> + retval = qemu_strtou64(nptr, &endptr, 16, &val); >> + if (retval) { >> + goto out; >> + } >> + if (*endptr == '.') { >> + endptr = nptr; >> + retval = -EINVAL; >> + goto out; >> + } >> + } else if (*endptr == '.') { >> + /* Input is fractional, insist on 0 <= fraction < 1, with no exponent */ >> + retval = qemu_strtod_finite(endptr, &endptr, &fraction); >> + if (retval) { >> + endptr = nptr; >> + goto out; >> + } >> + if (fraction >= 1.0 || memchr(nptr, 'e', endptr - nptr) >> + || memchr(nptr, 'E', endptr - nptr)) { > > Can this be simplified ? Fraction can be > 1, if-and only-if exponent > syntax is used IIUC. True for > 1.0, but false for == 1.0, because of the possibility of overflow while parsing a fraction like .99999999999999999999999999999999 > > Shouldn't we be passing 'endptr+1' as the first arg to memchr ? Not quite. I used endptr as both input and output in the strtod() call, so at this point, I'd have to add another variable to track where the '.' since endptr is no longer pointing there. But being lazy, I know that the integral portion contains no 'e', so even though the memchr is wasting effort searching bytes before the '.' for 'e', it is not wrong. > > nptr points to the leading decimal part, and we only need to > scan the fractional part. > > Also what happens with "1.1e" - that needs to be treated > as a exabyte suffix, and not rejected as an exponent. We > ought to test that if we don't already. Agree that we need to support it (well, depending on patch 3, it may be cleaner to write the test to look for support for 1.5e). Will make sure the test covers it. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 2/3] utils: Deprecate hex-with-suffix sizes 2021-02-04 19:07 [PATCH 0/3] Improve do_strtosz precision Eric Blake 2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake @ 2021-02-04 19:07 ` Eric Blake 2021-02-05 10:25 ` Vladimir Sementsov-Ogievskiy 2021-02-05 11:13 ` Daniel P. Berrangé 2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake 2 siblings, 2 replies; 31+ messages in thread From: Eric Blake @ 2021-02-04 19:07 UTC (permalink / raw) To: qemu-devel Cc: vsementsov, berrange, qemu-block, reviewer:Incompatible changes, tao3.xu, rjones, armbru Supporting '0x20M' looks odd, particularly since we have an 'E' suffix that is ambiguous between a hex digit and the extremely large exibyte suffix, as well as a 'B' suffix for bytes. In practice, people using hex inputs are specifying values in bytes (and would have written 0x2000000, or possibly relied on default_suffix in the case of qemu_strtosz_MiB), and the use of scaling suffixes makes the most sense for inputs in decimal (where the user would write 32M). But rather than outright dropping support for hex-with-suffix, let's follow our deprecation policy. Sadly, since qemu_strtosz() does not have an Err** parameter, we pollute to stderr. Signed-off-by: Eric Blake <eblake@redhat.com> --- docs/system/deprecated.rst | 8 ++++++++ util/cutils.c | 6 +++++- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst index 6ac757ed9fa7..c404c3926e6f 100644 --- a/docs/system/deprecated.rst +++ b/docs/system/deprecated.rst @@ -146,6 +146,14 @@ library enabled as a cryptography provider. Neither the ``nettle`` library, or the built-in cryptography provider are supported on FIPS enabled hosts. +hexadecimal sizes with scaling multipliers (since 6.0) +'''''''''''''''''''''''''''''''''''''''''''''''''''''' + +Input parameters that take a size value should only use a size suffix +(such as 'k' or 'M') when the base is written in decimal, and not when +the value is hexadecimal. That is, '0x20M' is deprecated, and should +be written either as '32M' or as '0x2000000'. + QEMU Machine Protocol (QMP) commands ------------------------------------ diff --git a/util/cutils.c b/util/cutils.c index 0234763bd70b..75190565cbb5 100644 --- a/util/cutils.c +++ b/util/cutils.c @@ -264,7 +264,7 @@ static int do_strtosz(const char *nptr, const char **end, int retval; const char *endptr; unsigned char c; - bool mul_required = false; + bool mul_required = false, hex = false; uint64_t val; int64_t mul; double fraction = 0.0; @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end, c = *endptr; mul = suffix_mul(c, unit); if (mul > 0) { + if (hex) { + fprintf(stderr, "Using a multiplier suffix on hex numbers " + "is deprecated: %s\n", nptr); + } endptr++; } else { mul = suffix_mul(default_suffix, unit); -- 2.30.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes 2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake @ 2021-02-05 10:25 ` Vladimir Sementsov-Ogievskiy 2021-02-05 10:31 ` Richard W.M. Jones 2021-02-05 13:38 ` Eric Blake 2021-02-05 11:13 ` Daniel P. Berrangé 1 sibling, 2 replies; 31+ messages in thread From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:25 UTC (permalink / raw) To: Eric Blake, qemu-devel Cc: rjones, berrange, armbru, qemu-block, tao3.xu, reviewer:Incompatible changes 04.02.2021 22:07, Eric Blake wrote: > Supporting '0x20M' looks odd, particularly since we have an 'E' suffix What about also deprecating 'E' suffix? (just my problem of reviewing previous patch) > that is ambiguous between a hex digit and the extremely large exibyte > suffix, as well as a 'B' suffix for bytes. In practice, people using > hex inputs are specifying values in bytes (and would have written > 0x2000000, or possibly relied on default_suffix in the case of > qemu_strtosz_MiB), and the use of scaling suffixes makes the most > sense for inputs in decimal (where the user would write 32M). But > rather than outright dropping support for hex-with-suffix, let's > follow our deprecation policy. Sadly, since qemu_strtosz() does not > have an Err** parameter, we pollute to stderr. > > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > docs/system/deprecated.rst | 8 ++++++++ > util/cutils.c | 6 +++++- > 2 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst > index 6ac757ed9fa7..c404c3926e6f 100644 > --- a/docs/system/deprecated.rst > +++ b/docs/system/deprecated.rst > @@ -146,6 +146,14 @@ library enabled as a cryptography provider. > Neither the ``nettle`` library, or the built-in cryptography provider are > supported on FIPS enabled hosts. > > +hexadecimal sizes with scaling multipliers (since 6.0) > +'''''''''''''''''''''''''''''''''''''''''''''''''''''' > + > +Input parameters that take a size value should only use a size suffix > +(such as 'k' or 'M') when the base is written in decimal, and not when > +the value is hexadecimal. That is, '0x20M' is deprecated, and should > +be written either as '32M' or as '0x2000000'. > + > QEMU Machine Protocol (QMP) commands > ------------------------------------ > > diff --git a/util/cutils.c b/util/cutils.c > index 0234763bd70b..75190565cbb5 100644 > --- a/util/cutils.c > +++ b/util/cutils.c > @@ -264,7 +264,7 @@ static int do_strtosz(const char *nptr, const char **end, > int retval; > const char *endptr; > unsigned char c; > - bool mul_required = false; > + bool mul_required = false, hex = false; > uint64_t val; > int64_t mul; > double fraction = 0.0; > @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end, you forget to set hex to true in corresponding if(){...} > c = *endptr; > mul = suffix_mul(c, unit); > if (mul > 0) { > + if (hex) { > + fprintf(stderr, "Using a multiplier suffix on hex numbers " > + "is deprecated: %s\n", nptr); > + } > endptr++; > } else { > mul = suffix_mul(default_suffix, unit); should we also deprecate hex where default_suffix is not 'B' ? -- Best regards, Vladimir ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes 2021-02-05 10:25 ` Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:31 ` Richard W.M. Jones 2021-02-05 13:38 ` Eric Blake 1 sibling, 0 replies; 31+ messages in thread From: Richard W.M. Jones @ 2021-02-05 10:31 UTC (permalink / raw) To: Vladimir Sementsov-Ogievskiy Cc: berrange, qemu-block, reviewer:Incompatible changes, tao3.xu, qemu-devel, armbru On Fri, Feb 05, 2021 at 01:25:18PM +0300, Vladimir Sementsov-Ogievskiy wrote: > 04.02.2021 22:07, Eric Blake wrote: > >Supporting '0x20M' looks odd, particularly since we have an 'E' suffix > > What about also deprecating 'E' suffix? (just my problem of reviewing previous patch) Ha! What if people want to specify exabytes?! That actually works at the moment: $ nbdkit memory 7E --run 'qemu-io -f raw -c "r -v 1E 512" "$uri"' Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes 2021-02-05 10:25 ` Vladimir Sementsov-Ogievskiy 2021-02-05 10:31 ` Richard W.M. Jones @ 2021-02-05 13:38 ` Eric Blake 1 sibling, 0 replies; 31+ messages in thread From: Eric Blake @ 2021-02-05 13:38 UTC (permalink / raw) To: Vladimir Sementsov-Ogievskiy, qemu-devel Cc: berrange, qemu-block, reviewer:Incompatible changes, tao3.xu, rjones, armbru On 2/5/21 4:25 AM, Vladimir Sementsov-Ogievskiy wrote: > 04.02.2021 22:07, Eric Blake wrote: >> Supporting '0x20M' looks odd, particularly since we have an 'E' suffix > > What about also deprecating 'E' suffix? (just my problem of reviewing > previous patch) No, we want to keep '1E' as a valid way to spell 1 exabyte. That has uses in the wild (admittedly corner-case, as that is a LOT of storage). It is only '0x1E' where the use of 'E' as a hex digit has priority over 'E' as a suffix meaning exabyte. > >> that is ambiguous between a hex digit and the extremely large exibyte >> suffix, as well as a 'B' suffix for bytes. In practice, people using >> hex inputs are specifying values in bytes (and would have written >> 0x2000000, or possibly relied on default_suffix in the case of >> qemu_strtosz_MiB), and the use of scaling suffixes makes the most >> sense for inputs in decimal (where the user would write 32M). But >> rather than outright dropping support for hex-with-suffix, let's >> follow our deprecation policy. Sadly, since qemu_strtosz() does not >> have an Err** parameter, we pollute to stderr. >> >> Signed-off-by: Eric Blake <eblake@redhat.com> >> --- >> +++ b/util/cutils.c >> @@ -264,7 +264,7 @@ static int do_strtosz(const char *nptr, const char >> **end, >> int retval; >> const char *endptr; >> unsigned char c; >> - bool mul_required = false; >> + bool mul_required = false, hex = false; >> uint64_t val; >> int64_t mul; >> double fraction = 0.0; >> @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const >> char **end, > > you forget to set hex to true in corresponding if(){...} > >> c = *endptr; >> mul = suffix_mul(c, unit); >> if (mul > 0) { >> + if (hex) { >> + fprintf(stderr, "Using a multiplier suffix on hex numbers " >> + "is deprecated: %s\n", nptr); >> + } D'oh. Now I get to rerun my tests to see when the warning triggers. >> endptr++; >> } else { >> mul = suffix_mul(default_suffix, unit); > > should we also deprecate hex where default_suffix is not 'B' ? That's exactly what this patch is (supposed to be) doing. If we parsed a hex number, and there was an explicit suffix at all (which is necessarily neither 'B' nor 'E', since those were already consumed while parsing the hex number), issue a warning. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes 2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake 2021-02-05 10:25 ` Vladimir Sementsov-Ogievskiy @ 2021-02-05 11:13 ` Daniel P. Berrangé 2021-02-05 13:40 ` Eric Blake 1 sibling, 1 reply; 31+ messages in thread From: Daniel P. Berrangé @ 2021-02-05 11:13 UTC (permalink / raw) To: Eric Blake Cc: vsementsov, qemu-block, rjones, reviewer:Incompatible changes, tao3.xu, armbru, qemu-devel On Thu, Feb 04, 2021 at 01:07:07PM -0600, Eric Blake wrote: > Supporting '0x20M' looks odd, particularly since we have an 'E' suffix > that is ambiguous between a hex digit and the extremely large exibyte > suffix, as well as a 'B' suffix for bytes. In practice, people using > hex inputs are specifying values in bytes (and would have written > 0x2000000, or possibly relied on default_suffix in the case of > qemu_strtosz_MiB), and the use of scaling suffixes makes the most > sense for inputs in decimal (where the user would write 32M). But > rather than outright dropping support for hex-with-suffix, let's > follow our deprecation policy. Sadly, since qemu_strtosz() does not > have an Err** parameter, we pollute to stderr. Err** is only appropriate for errors, not warnings, so this statement can be removed. > > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > docs/system/deprecated.rst | 8 ++++++++ > util/cutils.c | 6 +++++- > 2 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst > index 6ac757ed9fa7..c404c3926e6f 100644 > --- a/docs/system/deprecated.rst > +++ b/docs/system/deprecated.rst > @@ -146,6 +146,14 @@ library enabled as a cryptography provider. > Neither the ``nettle`` library, or the built-in cryptography provider are > supported on FIPS enabled hosts. > > +hexadecimal sizes with scaling multipliers (since 6.0) > +'''''''''''''''''''''''''''''''''''''''''''''''''''''' > + > +Input parameters that take a size value should only use a size suffix > +(such as 'k' or 'M') when the base is written in decimal, and not when > +the value is hexadecimal. That is, '0x20M' is deprecated, and should > +be written either as '32M' or as '0x2000000'. > + > QEMU Machine Protocol (QMP) commands > ------------------------------------ > > diff --git a/util/cutils.c b/util/cutils.c > index 0234763bd70b..75190565cbb5 100644 > --- a/util/cutils.c > +++ b/util/cutils.c > @@ -264,7 +264,7 @@ static int do_strtosz(const char *nptr, const char **end, > int retval; > const char *endptr; > unsigned char c; > - bool mul_required = false; > + bool mul_required = false, hex = false; > uint64_t val; > int64_t mul; > double fraction = 0.0; > @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end, > c = *endptr; > mul = suffix_mul(c, unit); > if (mul > 0) { > + if (hex) { > + fprintf(stderr, "Using a multiplier suffix on hex numbers " > + "is deprecated: %s\n", nptr); warn_report(), since IIUC, that'll get into HMP response correctly. > + } > endptr++; > } else { > mul = suffix_mul(default_suffix, unit); Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes 2021-02-05 11:13 ` Daniel P. Berrangé @ 2021-02-05 13:40 ` Eric Blake 2021-02-05 14:02 ` Daniel P. Berrangé 0 siblings, 1 reply; 31+ messages in thread From: Eric Blake @ 2021-02-05 13:40 UTC (permalink / raw) To: Daniel P. Berrangé Cc: vsementsov, qemu-block, rjones, reviewer:Incompatible changes, tao3.xu, armbru, qemu-devel On 2/5/21 5:13 AM, Daniel P. Berrangé wrote: > On Thu, Feb 04, 2021 at 01:07:07PM -0600, Eric Blake wrote: >> Supporting '0x20M' looks odd, particularly since we have an 'E' suffix >> that is ambiguous between a hex digit and the extremely large exibyte >> suffix, as well as a 'B' suffix for bytes. In practice, people using >> hex inputs are specifying values in bytes (and would have written >> 0x2000000, or possibly relied on default_suffix in the case of >> qemu_strtosz_MiB), and the use of scaling suffixes makes the most >> sense for inputs in decimal (where the user would write 32M). But >> rather than outright dropping support for hex-with-suffix, let's >> follow our deprecation policy. Sadly, since qemu_strtosz() does not >> have an Err** parameter, we pollute to stderr. > > Err** is only appropriate for errors, not warnings, so this statement > can be removed. The point of the comment was that we have no other mechanism for reporting the deprecation other than polluting to stderr. And if we ever remove the support, we'll either have to silently reject input that we used to accept, or plumb through Err** handling to allow better reporting of WHY we are rejecting input. But I didn't feel like adding Err** support now. >> @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end, >> c = *endptr; >> mul = suffix_mul(c, unit); >> if (mul > 0) { >> + if (hex) { >> + fprintf(stderr, "Using a multiplier suffix on hex numbers " >> + "is deprecated: %s\n", nptr); > > warn_report(), since IIUC, that'll get into HMP response correctly. Yes, that sounds better. I'll use that in v2, as well as tweaking the commit message. > >> + } >> endptr++; >> } else { >> mul = suffix_mul(default_suffix, unit); > > Regards, > Daniel > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 2/3] utils: Deprecate hex-with-suffix sizes 2021-02-05 13:40 ` Eric Blake @ 2021-02-05 14:02 ` Daniel P. Berrangé 0 siblings, 0 replies; 31+ messages in thread From: Daniel P. Berrangé @ 2021-02-05 14:02 UTC (permalink / raw) To: Eric Blake Cc: vsementsov, qemu-block, rjones, reviewer:Incompatible changes, tao3.xu, armbru, qemu-devel On Fri, Feb 05, 2021 at 07:40:36AM -0600, Eric Blake wrote: > On 2/5/21 5:13 AM, Daniel P. Berrangé wrote: > > On Thu, Feb 04, 2021 at 01:07:07PM -0600, Eric Blake wrote: > >> Supporting '0x20M' looks odd, particularly since we have an 'E' suffix > >> that is ambiguous between a hex digit and the extremely large exibyte > >> suffix, as well as a 'B' suffix for bytes. In practice, people using > >> hex inputs are specifying values in bytes (and would have written > >> 0x2000000, or possibly relied on default_suffix in the case of > >> qemu_strtosz_MiB), and the use of scaling suffixes makes the most > >> sense for inputs in decimal (where the user would write 32M). But > >> rather than outright dropping support for hex-with-suffix, let's > >> follow our deprecation policy. Sadly, since qemu_strtosz() does not > >> have an Err** parameter, we pollute to stderr. > > > > Err** is only appropriate for errors, not warnings, so this statement > > can be removed. > > The point of the comment was that we have no other mechanism for > reporting the deprecation other than polluting to stderr. And if we > ever remove the support, we'll either have to silently reject input that > we used to accept, or plumb through Err** handling to allow better > reporting of WHY we are rejecting input. But I didn't feel like adding > Err** support now. Yeah, I guess what I meant was that warning on stderr is the expected standard practice for deprecations. We only need to worry about other thing once the deprecation turns into a hard error later. > > >> @@ -309,6 +309,10 @@ static int do_strtosz(const char *nptr, const char **end, > >> c = *endptr; > >> mul = suffix_mul(c, unit); > >> if (mul > 0) { > >> + if (hex) { > >> + fprintf(stderr, "Using a multiplier suffix on hex numbers " > >> + "is deprecated: %s\n", nptr); > > > > warn_report(), since IIUC, that'll get into HMP response correctly. > > Yes, that sounds better. I'll use that in v2, as well as tweaking the > commit message. > > > > >> + } > >> endptr++; > >> } else { > >> mul = suffix_mul(default_suffix, unit); > > > > Regards, > > Daniel > > > > -- > Eric Blake, Principal Software Engineer > Red Hat, Inc. +1-919-301-3226 > Virtualization: qemu.org | libvirt.org Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes 2021-02-04 19:07 [PATCH 0/3] Improve do_strtosz precision Eric Blake 2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake 2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake @ 2021-02-04 19:07 ` Eric Blake 2021-02-04 20:02 ` Eric Blake ` (3 more replies) 2 siblings, 4 replies; 31+ messages in thread From: Eric Blake @ 2021-02-04 19:07 UTC (permalink / raw) To: qemu-devel; +Cc: vsementsov, berrange, qemu-block, tao3.xu, rjones, armbru The value '1.1k' is inexact; 1126.4 bytes is not possible, so we happen to truncate it to 1126. Our use of fractional sizes is intended for convenience, but when a user specifies a fraction that is not a clean translation to binary, truncating/rounding behind their backs can cause confusion. Better is to deprecate inexact values, which still leaves '1.5k' as valid, but alerts the user to spell out their values as a precise byte number in cases where they are currently being rounded. Note that values like '0.1G' in the testsuite need adjustment as a result. Sadly, since qemu_strtosz() does not have an Err** parameter, we pollute to stderr. Signed-off-by: Eric Blake <eblake@redhat.com> --- tests/test-cutils.c | 4 ++-- tests/test-keyval.c | 4 ++-- tests/test-qemu-opts.c | 4 ++-- util/cutils.c | 4 ++++ 4 files changed, 10 insertions(+), 6 deletions(-) diff --git a/tests/test-cutils.c b/tests/test-cutils.c index 0c2c89d6f113..ad51fb1baa51 100644 --- a/tests/test-cutils.c +++ b/tests/test-cutils.c @@ -2095,14 +2095,14 @@ static void test_qemu_strtosz_units(void) static void test_qemu_strtosz_float(void) { - const char *str = "12.345M"; + const char *str = "12.125M"; int err; const char *endptr; uint64_t res = 0xbaadf00d; err = qemu_strtosz(str, &endptr, &res); g_assert_cmpint(err, ==, 0); - g_assert_cmpint(res, ==, 12.345 * MiB); + g_assert_cmpint(res, ==, 12.125 * MiB); g_assert(endptr == str + 7); } diff --git a/tests/test-keyval.c b/tests/test-keyval.c index 13be763650b2..c951ac54cd23 100644 --- a/tests/test-keyval.c +++ b/tests/test-keyval.c @@ -522,7 +522,7 @@ static void test_keyval_visit_size(void) visit_free(v); /* Suffixes */ - qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.1G,sz5=16777215T", + qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.125G,sz5=16777215T", NULL, NULL, &error_abort); v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); qobject_unref(qdict); @@ -534,7 +534,7 @@ static void test_keyval_visit_size(void) visit_type_size(v, "sz3", &sz, &error_abort); g_assert_cmphex(sz, ==, 2 * MiB); visit_type_size(v, "sz4", &sz, &error_abort); - g_assert_cmphex(sz, ==, GiB / 10); + g_assert_cmphex(sz, ==, GiB / 8); visit_type_size(v, "sz5", &sz, &error_abort); g_assert_cmphex(sz, ==, 16777215ULL * TiB); visit_check_struct(v, &error_abort); diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c index f79b698e6715..6a1ea1d01c4f 100644 --- a/tests/test-qemu-opts.c +++ b/tests/test-qemu-opts.c @@ -717,10 +717,10 @@ static void test_opts_parse_size(void) g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, 8); g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 1536); g_assert_cmphex(qemu_opt_get_size(opts, "size3", 0), ==, 2 * MiB); - opts = qemu_opts_parse(&opts_list_02, "size1=0.1G,size2=16777215T", + opts = qemu_opts_parse(&opts_list_02, "size1=0.125G,size2=16777215T", false, &error_abort); g_assert_cmpuint(opts_count(opts), ==, 2); - g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 10); + g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 8); g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 16777215ULL * TiB); /* Beyond limit with suffix */ diff --git a/util/cutils.c b/util/cutils.c index 75190565cbb5..5ec6101ae778 100644 --- a/util/cutils.c +++ b/util/cutils.c @@ -327,6 +327,10 @@ static int do_strtosz(const char *nptr, const char **end, retval = -ERANGE; goto out; } + if (mul_required && fraction * mul != (uint64_t) (fraction * mul)) { + fprintf(stderr, "Using a fractional size that is not an exact byte " + "multiple is deprecated: %s\n", nptr); + } *result = val * mul + (uint64_t) (fraction * mul); retval = 0; -- 2.30.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes 2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake @ 2021-02-04 20:02 ` Eric Blake 2021-02-05 10:34 ` Richard W.M. Jones ` (2 subsequent siblings) 3 siblings, 0 replies; 31+ messages in thread From: Eric Blake @ 2021-02-04 20:02 UTC (permalink / raw) To: qemu-devel Cc: vsementsov, berrange, qemu-block, tao3.xu, armbru, libvirt-list@redhat.com On 2/4/21 1:07 PM, Eric Blake wrote: > The value '1.1k' is inexact; 1126.4 bytes is not possible, so we > happen to truncate it to 1126. Our use of fractional sizes is > intended for convenience, but when a user specifies a fraction that is > not a clean translation to binary, truncating/rounding behind their > backs can cause confusion. Better is to deprecate inexact values, > which still leaves '1.5k' as valid, but alerts the user to spell out > their values as a precise byte number in cases where they are > currently being rounded. And I promptly forgot to save my changes in my editor. Consider this squashed in: diff --git i/docs/system/deprecated.rst w/docs/system/deprecated.rst index c404c3926e6f..8e5e05a10642 100644 --- i/docs/system/deprecated.rst +++ w/docs/system/deprecated.rst @@ -154,6 +154,15 @@ Input parameters that take a size value should only use a size suffix the value is hexadecimal. That is, '0x20M' is deprecated, and should be written either as '32M' or as '0x2000000'. +inexact sizes via scaled fractions (since 6.0) +'''''''''''''''''''''''''''''''''''''''''''''' + +Input parameters that take a size value should only use a fractional +size (such as '1.5M') that will result in an exact byte value. The +use of inexact values (such as '1.1M') that require truncation or +rounding is deprecated, and you should instead consider writing your +unusual size in bytes (here, '1153433' or '1153434' as desired). + QEMU Machine Protocol (QMP) commands ------------------------------------ -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes 2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake 2021-02-04 20:02 ` Eric Blake @ 2021-02-05 10:34 ` Richard W.M. Jones 2021-02-05 14:19 ` Eric Blake 2021-02-05 10:38 ` Vladimir Sementsov-Ogievskiy 2021-02-05 11:10 ` Daniel P. Berrangé 3 siblings, 1 reply; 31+ messages in thread From: Richard W.M. Jones @ 2021-02-05 10:34 UTC (permalink / raw) To: Eric Blake; +Cc: vsementsov, berrange, qemu-block, tao3.xu, qemu-devel, armbru On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote: > The value '1.1k' is inexact; 1126.4 bytes is not possible, so we > happen to truncate it to 1126. Our use of fractional sizes is > intended for convenience, but when a user specifies a fraction that is > not a clean translation to binary, truncating/rounding behind their > backs can cause confusion. Better is to deprecate inexact values, > which still leaves '1.5k' as valid, but alerts the user to spell out > their values as a precise byte number in cases where they are > currently being rounded. > > Note that values like '0.1G' in the testsuite need adjustment as a > result. > > Sadly, since qemu_strtosz() does not have an Err** parameter, we > pollute to stderr. Does "deprecate" mean there's a plan to eventually remove this? If so I think it should say that (in docs/system/deprecated.rst I think). I think it would be better to remove this now or in the future since it's a trap for users. Rich. > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > tests/test-cutils.c | 4 ++-- > tests/test-keyval.c | 4 ++-- > tests/test-qemu-opts.c | 4 ++-- > util/cutils.c | 4 ++++ > 4 files changed, 10 insertions(+), 6 deletions(-) > > diff --git a/tests/test-cutils.c b/tests/test-cutils.c > index 0c2c89d6f113..ad51fb1baa51 100644 > --- a/tests/test-cutils.c > +++ b/tests/test-cutils.c > @@ -2095,14 +2095,14 @@ static void test_qemu_strtosz_units(void) > > static void test_qemu_strtosz_float(void) > { > - const char *str = "12.345M"; > + const char *str = "12.125M"; > int err; > const char *endptr; > uint64_t res = 0xbaadf00d; > > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > - g_assert_cmpint(res, ==, 12.345 * MiB); > + g_assert_cmpint(res, ==, 12.125 * MiB); > g_assert(endptr == str + 7); > } > > diff --git a/tests/test-keyval.c b/tests/test-keyval.c > index 13be763650b2..c951ac54cd23 100644 > --- a/tests/test-keyval.c > +++ b/tests/test-keyval.c > @@ -522,7 +522,7 @@ static void test_keyval_visit_size(void) > visit_free(v); > > /* Suffixes */ > - qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.1G,sz5=16777215T", > + qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.125G,sz5=16777215T", > NULL, NULL, &error_abort); > v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); > qobject_unref(qdict); > @@ -534,7 +534,7 @@ static void test_keyval_visit_size(void) > visit_type_size(v, "sz3", &sz, &error_abort); > g_assert_cmphex(sz, ==, 2 * MiB); > visit_type_size(v, "sz4", &sz, &error_abort); > - g_assert_cmphex(sz, ==, GiB / 10); > + g_assert_cmphex(sz, ==, GiB / 8); > visit_type_size(v, "sz5", &sz, &error_abort); > g_assert_cmphex(sz, ==, 16777215ULL * TiB); > visit_check_struct(v, &error_abort); > diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c > index f79b698e6715..6a1ea1d01c4f 100644 > --- a/tests/test-qemu-opts.c > +++ b/tests/test-qemu-opts.c > @@ -717,10 +717,10 @@ static void test_opts_parse_size(void) > g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, 8); > g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 1536); > g_assert_cmphex(qemu_opt_get_size(opts, "size3", 0), ==, 2 * MiB); > - opts = qemu_opts_parse(&opts_list_02, "size1=0.1G,size2=16777215T", > + opts = qemu_opts_parse(&opts_list_02, "size1=0.125G,size2=16777215T", > false, &error_abort); > g_assert_cmpuint(opts_count(opts), ==, 2); > - g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 10); > + g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 8); > g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 16777215ULL * TiB); > > /* Beyond limit with suffix */ > diff --git a/util/cutils.c b/util/cutils.c > index 75190565cbb5..5ec6101ae778 100644 > --- a/util/cutils.c > +++ b/util/cutils.c > @@ -327,6 +327,10 @@ static int do_strtosz(const char *nptr, const char **end, > retval = -ERANGE; > goto out; > } > + if (mul_required && fraction * mul != (uint64_t) (fraction * mul)) { > + fprintf(stderr, "Using a fractional size that is not an exact byte " > + "multiple is deprecated: %s\n", nptr); > + } > *result = val * mul + (uint64_t) (fraction * mul); > retval = 0; > > -- > 2.30.0 -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes 2021-02-05 10:34 ` Richard W.M. Jones @ 2021-02-05 14:19 ` Eric Blake 0 siblings, 0 replies; 31+ messages in thread From: Eric Blake @ 2021-02-05 14:19 UTC (permalink / raw) To: Richard W.M. Jones Cc: vsementsov, berrange, qemu-block, tao3.xu, qemu-devel, armbru On 2/5/21 4:34 AM, Richard W.M. Jones wrote: > On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote: >> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we >> happen to truncate it to 1126. Our use of fractional sizes is >> intended for convenience, but when a user specifies a fraction that is >> not a clean translation to binary, truncating/rounding behind their >> backs can cause confusion. Better is to deprecate inexact values, >> which still leaves '1.5k' as valid, but alerts the user to spell out >> their values as a precise byte number in cases where they are >> currently being rounded. >> >> Note that values like '0.1G' in the testsuite need adjustment as a >> result. >> >> Sadly, since qemu_strtosz() does not have an Err** parameter, we >> pollute to stderr. > > Does "deprecate" mean there's a plan to eventually remove this? If so > I think it should say that (in docs/system/deprecated.rst I think). Yes (well, if we agree to the patch; Dan has already voiced concern that we may not want 3/3 after all). And that's why I posted a followup message mentioning that I failed to commit that docs/ change before sending my v1. It will be in v2. > > I think it would be better to remove this now or in the future since > it's a trap for users. It's borderline - Markus has expressed a desire to remove inexact fractions, while Dan has expressed the desire that user convenience is important (as long as we are clear that non-fractional values are ALWAYS exact, and that the use of a fraction is for convenience at which point you KNOW you are risking rounding, then allowing both 1.1M and 1.5M instead of only one of the two is nicer). I submitted this patch because of IRC discussion, but if it is up to just me, I'd keep 2/3 but drop 3/3 (that is, I'm happy to deprecate 0x4000M, but not happy to deprecate 0.1G, but rather merely posted the patch to see what the fallout is because the question had been asked on IRC). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes 2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake 2021-02-04 20:02 ` Eric Blake 2021-02-05 10:34 ` Richard W.M. Jones @ 2021-02-05 10:38 ` Vladimir Sementsov-Ogievskiy 2021-02-05 11:10 ` Daniel P. Berrangé 3 siblings, 0 replies; 31+ messages in thread From: Vladimir Sementsov-Ogievskiy @ 2021-02-05 10:38 UTC (permalink / raw) To: Eric Blake, qemu-devel; +Cc: rjones, berrange, armbru, qemu-block, tao3.xu 04.02.2021 22:07, Eric Blake wrote: > The value '1.1k' is inexact; 1126.4 bytes is not possible, so we > happen to truncate it to 1126. Our use of fractional sizes is > intended for convenience, but when a user specifies a fraction that is > not a clean translation to binary, truncating/rounding behind their > backs can cause confusion. Better is to deprecate inexact values, > which still leaves '1.5k' as valid, but alerts the user to spell out > their values as a precise byte number in cases where they are > currently being rounded. > > Note that values like '0.1G' in the testsuite need adjustment as a > result. > Honestly, I don't agree with the reasoning. User shouldn't expect something extremely precise when he (is it correct use "he" here, or what is the right word?) specify fractional size. I doubt that someone use fractional sizes for production. Also, this will break almost everything except for 0.5.. I don't think such deprecation will force people to change their habits to use 0.125 instead of 0.1. Simpler is to move to integrals and don't care. So, in this way we can just deprecate fractional sizes at all and don't care with them anymore. Or allow only "0.5" fraction as the only exclusion :) > Sadly, since qemu_strtosz() does not have an Err** parameter, we > pollute to stderr. > > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > tests/test-cutils.c | 4 ++-- > tests/test-keyval.c | 4 ++-- > tests/test-qemu-opts.c | 4 ++-- > util/cutils.c | 4 ++++ > 4 files changed, 10 insertions(+), 6 deletions(-) > > diff --git a/tests/test-cutils.c b/tests/test-cutils.c > index 0c2c89d6f113..ad51fb1baa51 100644 > --- a/tests/test-cutils.c > +++ b/tests/test-cutils.c > @@ -2095,14 +2095,14 @@ static void test_qemu_strtosz_units(void) > > static void test_qemu_strtosz_float(void) > { > - const char *str = "12.345M"; > + const char *str = "12.125M"; > int err; > const char *endptr; > uint64_t res = 0xbaadf00d; > > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > - g_assert_cmpint(res, ==, 12.345 * MiB); > + g_assert_cmpint(res, ==, 12.125 * MiB); > g_assert(endptr == str + 7); > } > > diff --git a/tests/test-keyval.c b/tests/test-keyval.c > index 13be763650b2..c951ac54cd23 100644 > --- a/tests/test-keyval.c > +++ b/tests/test-keyval.c > @@ -522,7 +522,7 @@ static void test_keyval_visit_size(void) > visit_free(v); > > /* Suffixes */ > - qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.1G,sz5=16777215T", > + qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.125G,sz5=16777215T", > NULL, NULL, &error_abort); > v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); > qobject_unref(qdict); > @@ -534,7 +534,7 @@ static void test_keyval_visit_size(void) > visit_type_size(v, "sz3", &sz, &error_abort); > g_assert_cmphex(sz, ==, 2 * MiB); > visit_type_size(v, "sz4", &sz, &error_abort); > - g_assert_cmphex(sz, ==, GiB / 10); > + g_assert_cmphex(sz, ==, GiB / 8); > visit_type_size(v, "sz5", &sz, &error_abort); > g_assert_cmphex(sz, ==, 16777215ULL * TiB); > visit_check_struct(v, &error_abort); > diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c > index f79b698e6715..6a1ea1d01c4f 100644 > --- a/tests/test-qemu-opts.c > +++ b/tests/test-qemu-opts.c > @@ -717,10 +717,10 @@ static void test_opts_parse_size(void) > g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, 8); > g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 1536); > g_assert_cmphex(qemu_opt_get_size(opts, "size3", 0), ==, 2 * MiB); > - opts = qemu_opts_parse(&opts_list_02, "size1=0.1G,size2=16777215T", > + opts = qemu_opts_parse(&opts_list_02, "size1=0.125G,size2=16777215T", > false, &error_abort); > g_assert_cmpuint(opts_count(opts), ==, 2); > - g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 10); > + g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 8); > g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 16777215ULL * TiB); > > /* Beyond limit with suffix */ > diff --git a/util/cutils.c b/util/cutils.c > index 75190565cbb5..5ec6101ae778 100644 > --- a/util/cutils.c > +++ b/util/cutils.c > @@ -327,6 +327,10 @@ static int do_strtosz(const char *nptr, const char **end, > retval = -ERANGE; > goto out; > } > + if (mul_required && fraction * mul != (uint64_t) (fraction * mul)) { > + fprintf(stderr, "Using a fractional size that is not an exact byte " > + "multiple is deprecated: %s\n", nptr); > + } > *result = val * mul + (uint64_t) (fraction * mul); > retval = 0; > -- Best regards, Vladimir ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes 2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake ` (2 preceding siblings ...) 2021-02-05 10:38 ` Vladimir Sementsov-Ogievskiy @ 2021-02-05 11:10 ` Daniel P. Berrangé 2021-02-05 14:28 ` Eric Blake 3 siblings, 1 reply; 31+ messages in thread From: Daniel P. Berrangé @ 2021-02-05 11:10 UTC (permalink / raw) To: Eric Blake; +Cc: vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote: > The value '1.1k' is inexact; 1126.4 bytes is not possible, so we > happen to truncate it to 1126. Our use of fractional sizes is > intended for convenience, but when a user specifies a fraction that is > not a clean translation to binary, truncating/rounding behind their > backs can cause confusion. Better is to deprecate inexact values, > which still leaves '1.5k' as valid, but alerts the user to spell out > their values as a precise byte number in cases where they are > currently being rounded. I don't think we should be deprecating this, as I think it makes it very user hostile. Users who require exact answers, won't be using fractional syntax in the first place. IOW, by using fractional syntax you've decided that approximation is acceptable. Given that, I should not have to worry about whether or not the fraction I'm using is exact or truncated. It is horrible usability to say that "1.1k" is invalid, while "1.5k" is valid - both are valid from my POV as a user of this. > Note that values like '0.1G' in the testsuite need adjustment as a > result. > > Sadly, since qemu_strtosz() does not have an Err** parameter, we > pollute to stderr. This is only an warning, so setting an Err ** would not be appropriate right now. None the less we should add an Err **, because many of the callers want an Err ** object populated, or use error_report(). > > Signed-off-by: Eric Blake <eblake@redhat.com> > --- > tests/test-cutils.c | 4 ++-- > tests/test-keyval.c | 4 ++-- > tests/test-qemu-opts.c | 4 ++-- > util/cutils.c | 4 ++++ > 4 files changed, 10 insertions(+), 6 deletions(-) > > diff --git a/tests/test-cutils.c b/tests/test-cutils.c > index 0c2c89d6f113..ad51fb1baa51 100644 > --- a/tests/test-cutils.c > +++ b/tests/test-cutils.c > @@ -2095,14 +2095,14 @@ static void test_qemu_strtosz_units(void) > > static void test_qemu_strtosz_float(void) > { > - const char *str = "12.345M"; > + const char *str = "12.125M"; > int err; > const char *endptr; > uint64_t res = 0xbaadf00d; > > err = qemu_strtosz(str, &endptr, &res); > g_assert_cmpint(err, ==, 0); > - g_assert_cmpint(res, ==, 12.345 * MiB); > + g_assert_cmpint(res, ==, 12.125 * MiB); > g_assert(endptr == str + 7); > } > > diff --git a/tests/test-keyval.c b/tests/test-keyval.c > index 13be763650b2..c951ac54cd23 100644 > --- a/tests/test-keyval.c > +++ b/tests/test-keyval.c > @@ -522,7 +522,7 @@ static void test_keyval_visit_size(void) > visit_free(v); > > /* Suffixes */ > - qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.1G,sz5=16777215T", > + qdict = keyval_parse("sz1=8b,sz2=1.5k,sz3=2M,sz4=0.125G,sz5=16777215T", > NULL, NULL, &error_abort); > v = qobject_input_visitor_new_keyval(QOBJECT(qdict)); > qobject_unref(qdict); > @@ -534,7 +534,7 @@ static void test_keyval_visit_size(void) > visit_type_size(v, "sz3", &sz, &error_abort); > g_assert_cmphex(sz, ==, 2 * MiB); > visit_type_size(v, "sz4", &sz, &error_abort); > - g_assert_cmphex(sz, ==, GiB / 10); > + g_assert_cmphex(sz, ==, GiB / 8); > visit_type_size(v, "sz5", &sz, &error_abort); > g_assert_cmphex(sz, ==, 16777215ULL * TiB); > visit_check_struct(v, &error_abort); > diff --git a/tests/test-qemu-opts.c b/tests/test-qemu-opts.c > index f79b698e6715..6a1ea1d01c4f 100644 > --- a/tests/test-qemu-opts.c > +++ b/tests/test-qemu-opts.c > @@ -717,10 +717,10 @@ static void test_opts_parse_size(void) > g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, 8); > g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 1536); > g_assert_cmphex(qemu_opt_get_size(opts, "size3", 0), ==, 2 * MiB); > - opts = qemu_opts_parse(&opts_list_02, "size1=0.1G,size2=16777215T", > + opts = qemu_opts_parse(&opts_list_02, "size1=0.125G,size2=16777215T", > false, &error_abort); > g_assert_cmpuint(opts_count(opts), ==, 2); > - g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 10); > + g_assert_cmphex(qemu_opt_get_size(opts, "size1", 0), ==, GiB / 8); > g_assert_cmphex(qemu_opt_get_size(opts, "size2", 0), ==, 16777215ULL * TiB); > > /* Beyond limit with suffix */ > diff --git a/util/cutils.c b/util/cutils.c > index 75190565cbb5..5ec6101ae778 100644 > --- a/util/cutils.c > +++ b/util/cutils.c > @@ -327,6 +327,10 @@ static int do_strtosz(const char *nptr, const char **end, > retval = -ERANGE; > goto out; > } > + if (mul_required && fraction * mul != (uint64_t) (fraction * mul)) { > + fprintf(stderr, "Using a fractional size that is not an exact byte " > + "multiple is deprecated: %s\n", nptr); > + } > *result = val * mul + (uint64_t) (fraction * mul); > retval = 0; > > -- > 2.30.0 > Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes 2021-02-05 11:10 ` Daniel P. Berrangé @ 2021-02-05 14:28 ` Eric Blake 2021-02-05 14:40 ` Daniel P. Berrangé 0 siblings, 1 reply; 31+ messages in thread From: Eric Blake @ 2021-02-05 14:28 UTC (permalink / raw) To: Daniel P. Berrangé Cc: vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel On 2/5/21 5:10 AM, Daniel P. Berrangé wrote: > On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote: >> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we >> happen to truncate it to 1126. Our use of fractional sizes is >> intended for convenience, but when a user specifies a fraction that is >> not a clean translation to binary, truncating/rounding behind their >> backs can cause confusion. Better is to deprecate inexact values, >> which still leaves '1.5k' as valid, but alerts the user to spell out >> their values as a precise byte number in cases where they are >> currently being rounded. > > I don't think we should be deprecating this, as I think it makes > it very user hostile. Users who require exact answers, won't be > using fractional syntax in the first place. IOW, by using fractional > syntax you've decided that approximation is acceptable. Given that, > I should not have to worry about whether or not the fraction I'm > using is exact or truncated. It is horrible usability to say that > "1.1k" is invalid, while "1.5k" is valid - both are valid from my > POV as a user of this. > > > >> Note that values like '0.1G' in the testsuite need adjustment as a >> result. >> >> Sadly, since qemu_strtosz() does not have an Err** parameter, we >> pollute to stderr. > > This is only an warning, so setting an Err ** would not be appropriate > right now. > > None the less we should add an Err **, because many of the callers > want an Err ** object populated, or use error_report(). That is more effort. What's the consensus - is it important enough that I should spend that effort getting rid of technical debt by adding versions of qemu_strto* that take Err** at this point in time? -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes 2021-02-05 14:28 ` Eric Blake @ 2021-02-05 14:40 ` Daniel P. Berrangé 0 siblings, 0 replies; 31+ messages in thread From: Daniel P. Berrangé @ 2021-02-05 14:40 UTC (permalink / raw) To: Eric Blake; +Cc: vsementsov, qemu-block, rjones, tao3.xu, armbru, qemu-devel On Fri, Feb 05, 2021 at 08:28:31AM -0600, Eric Blake wrote: > On 2/5/21 5:10 AM, Daniel P. Berrangé wrote: > > On Thu, Feb 04, 2021 at 01:07:08PM -0600, Eric Blake wrote: > >> The value '1.1k' is inexact; 1126.4 bytes is not possible, so we > >> happen to truncate it to 1126. Our use of fractional sizes is > >> intended for convenience, but when a user specifies a fraction that is > >> not a clean translation to binary, truncating/rounding behind their > >> backs can cause confusion. Better is to deprecate inexact values, > >> which still leaves '1.5k' as valid, but alerts the user to spell out > >> their values as a precise byte number in cases where they are > >> currently being rounded. > > > > I don't think we should be deprecating this, as I think it makes > > it very user hostile. Users who require exact answers, won't be > > using fractional syntax in the first place. IOW, by using fractional > > syntax you've decided that approximation is acceptable. Given that, > > I should not have to worry about whether or not the fraction I'm > > using is exact or truncated. It is horrible usability to say that > > "1.1k" is invalid, while "1.5k" is valid - both are valid from my > > POV as a user of this. > > > > > > > >> Note that values like '0.1G' in the testsuite need adjustment as a > >> result. > >> > >> Sadly, since qemu_strtosz() does not have an Err** parameter, we > >> pollute to stderr. > > > > This is only an warning, so setting an Err ** would not be appropriate > > right now. > > > > None the less we should add an Err **, because many of the callers > > want an Err ** object populated, or use error_report(). > > That is more effort. What's the consensus - is it important enough that > I should spend that effort getting rid of technical debt by adding > versions of qemu_strto* that take Err** at this point in time? To be clear, I'm not requesting that you make it use Err** right now. I just raise it as an idea for future technical debt removal. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| ^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2021-02-05 14:41 UTC | newest] Thread overview: 31+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-02-04 19:07 [PATCH 0/3] Improve do_strtosz precision Eric Blake 2021-02-04 19:07 ` [PATCH 1/3] utils: Improve qemu_strtosz() to have 64 bits of precision Eric Blake 2021-02-04 20:12 ` Eric Blake 2021-02-05 10:06 ` Vladimir Sementsov-Ogievskiy 2021-02-05 10:18 ` Vladimir Sementsov-Ogievskiy 2021-02-05 14:06 ` Eric Blake 2021-02-05 14:10 ` Daniel P. Berrangé 2021-02-05 10:07 ` Vladimir Sementsov-Ogievskiy 2021-02-05 14:12 ` Eric Blake 2021-02-05 10:28 ` Richard W.M. Jones 2021-02-05 14:15 ` Eric Blake 2021-02-05 11:02 ` Daniel P. Berrangé 2021-02-05 14:27 ` Eric Blake 2021-02-05 14:36 ` Daniel P. Berrangé 2021-02-05 11:34 ` Daniel P. Berrangé 2021-02-05 14:36 ` Eric Blake 2021-02-04 19:07 ` [PATCH 2/3] utils: Deprecate hex-with-suffix sizes Eric Blake 2021-02-05 10:25 ` Vladimir Sementsov-Ogievskiy 2021-02-05 10:31 ` Richard W.M. Jones 2021-02-05 13:38 ` Eric Blake 2021-02-05 11:13 ` Daniel P. Berrangé 2021-02-05 13:40 ` Eric Blake 2021-02-05 14:02 ` Daniel P. Berrangé 2021-02-04 19:07 ` [PATCH 3/3] utils: Deprecate inexact fractional suffix sizes Eric Blake 2021-02-04 20:02 ` Eric Blake 2021-02-05 10:34 ` Richard W.M. Jones 2021-02-05 14:19 ` Eric Blake 2021-02-05 10:38 ` Vladimir Sementsov-Ogievskiy 2021-02-05 11:10 ` Daniel P. Berrangé 2021-02-05 14:28 ` Eric Blake 2021-02-05 14:40 ` Daniel P. Berrangé
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).