From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
Ulf Hansson <ulf.hansson@linaro.org>,
James Bottomley <JBottomley@Odin.com>,
Kees Cook <keescook@chromium.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size()
Date: Thu, 29 Oct 2015 14:51:32 +0100 [thread overview]
Message-ID: <87eggdzst7.fsf@vitty.brq.redhat.com> (raw)
In-Reply-To: <87twpcvxc6.fsf@rasmusvillemoes.dk> (Rasmus Villemoes's message of "Tue, 27 Oct 2015 22:02:49 +0100")
Rasmus Villemoes <linux@rasmusvillemoes.dk> writes:
> On Tue, Oct 27 2015, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> string_get_size() loses precision when there is a remainder for
>> blk_size / divisor[units] and size is big enough. E.g
>> string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
>> while it is supposed to return "33.5 MB". For some artificial inputs
>> the result can be ridiculously wrong, e.g.
>> string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
>> when "5.70 MB" is expected.
>>
>> The issues comes from the fact than we through away
>> blk_size / divisor[units] remainder when size is > exp. This can be fixed
>> by saving it and doing some non-trivial calculations later to fix the error
>> but that would make this function even more cumbersome. Slightly re-factor
>> the function to not lose the precision for all inputs.
>>
>> The overall complexity of this function comes from the fact that size can
>> be huge and we don't want to do size * blk_size as it can overflow. Do the
>> math in two steps:
>> 1) Reduce size to something < blk_size * divisor[units]
>> 2) Multiply the result (and the remainder) by blk_size and do final
>> calculations.
>>
>> Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> Changes since v1:
>> - Check against blk_size == 0 [Rasmus Villemoes]
>> - Do not rename 'i' to 'order' [Andy Shevchenko]
>> ---
>> lib/string_helpers.c | 37 ++++++++++++++++++++++++-------------
>> 1 file changed, 24 insertions(+), 13 deletions(-)
>>
>> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> index f6c27dc..eba8e82 100644
>> --- a/lib/string_helpers.c
>> +++ b/lib/string_helpers.c
>> @@ -44,7 +44,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> [STRING_UNITS_2] = 1024,
>> };
>> int i, j;
>> - u32 remainder = 0, sf_cap, exp;
>> + u64 remainder = 0;
>> + u32 sf_cap;
>> char tmp[8];
>> const char *unit;
>>
>> @@ -53,28 +54,36 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> if (!size)
>> goto out;
>>
>> - while (blk_size >= divisor[units]) {
>> - remainder = do_div(blk_size, divisor[units]);
>> - i++;
>> + if (!blk_size) {
>> + WARN_ON(1);
>> + size = 0;
>> + goto out;
>> }
>
> I don't think we need to handle that, but if you want, please put the
> WARN inside the conditional (so say "if (WARN_ON(!blk_size)) {...}". And
> don't make it _ONCE; the ordinary version is slightly cheaper (since
> there's no static bool __warned and no code to manage that).
Ok, I'll do that in a separate patch and hope Andy is not going to
complain about not adding _ONCE.
>
>>
>> - exp = divisor[units] / blk_size;
>> /*
>> - * size must be strictly greater than exp here to ensure that remainder
>> - * is greater than divisor[units] coming out of the if below.
>> + * size can be huge and doing size * blk_size right away can overflow.
>> + * As a first step reduce huge size to something less than
>> + * blk_size * divisor[units].
>> */
>> - if (size > exp) {
>> + while (size > (u64)blk_size * divisor[units]) {
>
> It seems weird that we reduce _more_ for smaller block sizes - indeed,
> for blk_size==1 we end up reducing size to <= 1000 (or 1024), which is
> certainly a good way to lose some precision. Also, this relies on
> blk_size being smaller than (roughly) sqrt(U64_MAX/divisor[units]),
> which is of course true in practice, but a slightly annoying implicit
> assumption.
>
> I do think that my approach of reducing size till it's smaller than
> U64_MAX/blk_size is simpler and better. There's much less fixup
> code. It's simply
>
> while (size > div_u64(U64_MAX, blk_size) {
> do_div(size, divisor[units]);
> ++i;
> }
> size *= blk_size;
> while (size > divisor[units]) {
> remainder = do_div(size, divisor[units]);
> ++i;
> }
>
> which is as self-explaining as it gets.
>
OK, let me borrow this idea from you and change Reported-by: tag with
Suggested-by:. I'll test and send v3.
> And yes, one needs to use the include/linux/math64.h functions/macros
> for u64/u32 divisions - otherwise I'm pretty sure one will get friendly
> mails from the build bot.
>
> while (size > )
>> remainder = do_div(size, divisor[units]);
>> - remainder *= blk_size;
>> i++;
>> - } else {
>> - remainder *= size;
>> }
>>
>> + /* Now we're OK with doing size * blk_size, it won't overflow. */
>> size *= blk_size;
>> + remainder *= blk_size;
>> + /*
>> + * We were doing partial multiplication by blk_size.
>> + * remainder >= divisor[units] here means size should be increased.
>> + */
>> size += remainder / divisor[units];
>> - remainder %= divisor[units];
>> + remainder -= (remainder / divisor[units]) * divisor[units];
>>
>> + /*
>> + * Normalize. size >= divisor[units] means we still have enough
>> + * precision and dropping remainder is fine.
>> + */
>> while (size >= divisor[units]) {
>> remainder = do_div(size, divisor[units]);
>> i++;
>> @@ -87,7 +96,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> if (j) {
>> remainder *= 1000;
>> remainder /= divisor[units];
>> - snprintf(tmp, sizeof(tmp), ".%03u", remainder);
>> + /* remainder is < divisor[units] here, (u32) is legit */
>
> What is actually important is that remainder is < 1000. remainder was
> initially < divisor[units], but then the multiplication and division
> transformed that into "1/1000s" of whatever unit we're using.
>
>> + snprintf(tmp, sizeof(tmp), ".%03u", (u32)remainder);
>> tmp[j+1] = '\0';
>> }
>>
>> @@ -97,6 +107,7 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>> else
>> unit = units_str[units][i];
>>
>> + /* size is < divisor[units] here, (u32) is legit */
>> snprintf(buf, len, "%u%s %s", (u32)size,
>> tmp, unit);
>> }
--
Vitaly
next prev parent reply other threads:[~2015-10-29 13:51 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-27 14:06 [PATCH v2 0/3] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
2015-10-27 14:06 ` [PATCH v2 1/3] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
2015-10-27 14:06 ` [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
2015-10-27 15:25 ` Andy Shevchenko
2015-10-27 16:16 ` Vitaly Kuznetsov
2015-10-27 16:24 ` Andy Shevchenko
2015-10-27 21:02 ` Rasmus Villemoes
2015-10-29 13:51 ` Vitaly Kuznetsov [this message]
2015-10-27 14:06 ` [PATCH v2 3/3] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87eggdzst7.fsf@vitty.brq.redhat.com \
--to=vkuznets@redhat.com \
--cc=JBottomley@Odin.com \
--cc=akpm@linux-foundation.org \
--cc=andriy.shevchenko@linux.intel.com \
--cc=keescook@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@rasmusvillemoes.dk \
--cc=ulf.hansson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.