From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
Ulf Hansson <ulf.hansson@linaro.org>,
James Bottomley <JBottomley@Odin.com>,
Kees Cook <keescook@chromium.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size()
Date: Thu, 29 Oct 2015 14:51:32 +0100 [thread overview]
Message-ID: <87eggdzst7.fsf@vitty.brq.redhat.com> (raw)
In-Reply-To: <87twpcvxc6.fsf@rasmusvillemoes.dk> (Rasmus Villemoes's message of "Tue, 27 Oct 2015 22:02:49 +0100")
Rasmus Villemoes <linux@rasmusvillemoes.dk> writes:
> On Tue, Oct 27 2015, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> string_get_size() loses precision when there is a remainder for
>> blk_size / divisor[units] and size is big enough. E.g
>> string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
>> while it is supposed to return "33.5 MB". For some artificial inputs
>> the result can be ridiculously wrong, e.g.
>> string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
>> when "5.70 MB" is expected.
>>
>> The issues comes from the fact than we through away
>> blk_size / divisor[units] remainder when size is > exp. This can be fixed
>> by saving it and doing some non-trivial calculations later to fix the error
>> but that would make this function even more cumbersome. Slightly re-factor
>> the function to not lose the precision for all inputs.
>>
>> The overall complexity of this function comes from the fact that size can
>> be huge and we don't want to do size * blk_size as it can overflow. Do the
>> math in two steps:
>> 1) Reduce size to something < blk_size * divisor[units]
>> 2) Multiply the result (and the remainder) by blk_size and do final
>> calculations.
>>
>> Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> Changes since v1:
>> - Check against blk_size == 0 [Rasmus Villemoes]
>> - Do not rename 'i' to 'order' [Andy Shevchenko]
>> ---
>> lib/string_helpers.c | 37 ++++++++++++++++++++++++-------------
>> 1 file changed, 24 insertions(+), 13 deletions(-)
>>
>> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> index f6c27dc..eba8e82 100644
>> --- a/lib/string_helpers.c
>> +++ b/lib/string_helpers.c
>> @@ -44,7 +44,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> [STRING_UNITS_2] = 1024,
>> };
>> int i, j;
>> - u32 remainder = 0, sf_cap, exp;
>> + u64 remainder = 0;
>> + u32 sf_cap;
>> char tmp[8];
>> const char *unit;
>>
>> @@ -53,28 +54,36 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> if (!size)
>> goto out;
>>
>> - while (blk_size >= divisor[units]) {
>> - remainder = do_div(blk_size, divisor[units]);
>> - i++;
>> + if (!blk_size) {
>> + WARN_ON(1);
>> + size = 0;
>> + goto out;
>> }
>
> I don't think we need to handle that, but if you want, please put the
> WARN inside the conditional (so say "if (WARN_ON(!blk_size)) {...}". And
> don't make it _ONCE; the ordinary version is slightly cheaper (since
> there's no static bool __warned and no code to manage that).
Ok, I'll do that in a separate patch and hope Andy is not going to
complain about not adding _ONCE.
>
>>
>> - exp = divisor[units] / blk_size;
>> /*
>> - * size must be strictly greater than exp here to ensure that remainder
>> - * is greater than divisor[units] coming out of the if below.
>> + * size can be huge and doing size * blk_size right away can overflow.
>> + * As a first step reduce huge size to something less than
>> + * blk_size * divisor[units].
>> */
>> - if (size > exp) {
>> + while (size > (u64)blk_size * divisor[units]) {
>
> It seems weird that we reduce _more_ for smaller block sizes - indeed,
> for blk_size==1 we end up reducing size to <= 1000 (or 1024), which is
> certainly a good way to lose some precision. Also, this relies on
> blk_size being smaller than (roughly) sqrt(U64_MAX/divisor[units]),
> which is of course true in practice, but a slightly annoying implicit
> assumption.
>
> I do think that my approach of reducing size till it's smaller than
> U64_MAX/blk_size is simpler and better. There's much less fixup
> code. It's simply
>
> while (size > div_u64(U64_MAX, blk_size) {
> do_div(size, divisor[units]);
> ++i;
> }
> size *= blk_size;
> while (size > divisor[units]) {
> remainder = do_div(size, divisor[units]);
> ++i;
> }
>
> which is as self-explaining as it gets.
>
OK, let me borrow this idea from you and change Reported-by: tag with
Suggested-by:. I'll test and send v3.
> And yes, one needs to use the include/linux/math64.h functions/macros
> for u64/u32 divisions - otherwise I'm pretty sure one will get friendly
> mails from the build bot.
>
> while (size > )
>> remainder = do_div(size, divisor[units]);
>> - remainder *= blk_size;
>> i++;
>> - } else {
>> - remainder *= size;
>> }
>>
>> + /* Now we're OK with doing size * blk_size, it won't overflow. */
>> size *= blk_size;
>> + remainder *= blk_size;
>> + /*
>> + * We were doing partial multiplication by blk_size.
>> + * remainder >= divisor[units] here means size should be increased.
>> + */
>> size += remainder / divisor[units];
>> - remainder %= divisor[units];
>> + remainder -= (remainder / divisor[units]) * divisor[units];
>>
>> + /*
>> + * Normalize. size >= divisor[units] means we still have enough
>> + * precision and dropping remainder is fine.
>> + */
>> while (size >= divisor[units]) {
>> remainder = do_div(size, divisor[units]);
>> i++;
>> @@ -87,7 +96,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> if (j) {
>> remainder *= 1000;
>> remainder /= divisor[units];
>> - snprintf(tmp, sizeof(tmp), ".%03u", remainder);
>> + /* remainder is < divisor[units] here, (u32) is legit */
>
> What is actually important is that remainder is < 1000. remainder was
> initially < divisor[units], but then the multiplication and division
> transformed that into "1/1000s" of whatever unit we're using.
>
>> + snprintf(tmp, sizeof(tmp), ".%03u", (u32)remainder);
>> tmp[j+1] = '\0';
>> }
>>
>> @@ -97,6 +107,7 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> else
>> unit = units_str[units][i];
>>
>> + /* size is < divisor[units] here, (u32) is legit */
>> snprintf(buf, len, "%u%s %s", (u32)size,
>> tmp, unit);
>> }
--
Vitaly
next prev parent reply other threads:[~2015-10-29 13:51 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-27 14:06 [PATCH v2 0/3] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
2015-10-27 14:06 ` [PATCH v2 1/3] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
2015-10-27 14:06 ` [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
2015-10-27 15:25 ` Andy Shevchenko
2015-10-27 16:16 ` Vitaly Kuznetsov
2015-10-27 16:24 ` Andy Shevchenko
2015-10-27 21:02 ` Rasmus Villemoes
2015-10-29 13:51 ` Vitaly Kuznetsov [this message]
2015-10-27 14:06 ` [PATCH v2 3/3] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87eggdzst7.fsf@vitty.brq.redhat.com \
--to=vkuznets@redhat.com \
--cc=JBottomley@Odin.com \
--cc=akpm@linux-foundation.org \
--cc=andriy.shevchenko@linux.intel.com \
--cc=keescook@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@rasmusvillemoes.dk \
--cc=ulf.hansson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).