linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	James Bottomley <JBottomley@Odin.com>,
	Kees Cook <keescook@chromium.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size()
Date: Thu, 29 Oct 2015 14:51:32 +0100	[thread overview]
Message-ID: <87eggdzst7.fsf@vitty.brq.redhat.com> (raw)
In-Reply-To: <87twpcvxc6.fsf@rasmusvillemoes.dk> (Rasmus Villemoes's message of "Tue, 27 Oct 2015 22:02:49 +0100")

Rasmus Villemoes <linux@rasmusvillemoes.dk> writes:

> On Tue, Oct 27 2015, Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>
>> string_get_size() loses precision when there is a remainder for
>> blk_size / divisor[units] and size is big enough. E.g
>> string_get_size(8192, 4096, STRING_UNITS_10, ...) returns "32.7 MB"
>> while it is supposed to return "33.5 MB". For some artificial inputs
>> the result can be ridiculously wrong, e.g.
>> string_get_size(3000, 1900, STRING_UNITS_10, ...) returns "3.00 MB"
>> when "5.70 MB" is expected.
>>
>> The issues comes from the fact than we through away
>> blk_size / divisor[units] remainder when size is > exp. This can be fixed
>> by saving it and doing some non-trivial calculations later to fix the error
>> but that would make this function even more cumbersome. Slightly re-factor
>> the function to not lose the precision for all inputs.
>>
>> The overall complexity of this function comes from the fact that size can
>> be huge and we don't want to do size * blk_size as it can overflow. Do the
>> math in two steps:
>> 1) Reduce size to something < blk_size * divisor[units]
>> 2) Multiply the result (and the remainder) by blk_size and do final
>>    calculations.
>>
>> Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> Changes since v1:
>> - Check against blk_size == 0 [Rasmus Villemoes]
>> - Do not rename 'i' to 'order' [Andy Shevchenko]
>> ---
>>  lib/string_helpers.c | 37 ++++++++++++++++++++++++-------------
>>  1 file changed, 24 insertions(+), 13 deletions(-)
>>
>> diff --git a/lib/string_helpers.c b/lib/string_helpers.c
>> index f6c27dc..eba8e82 100644
>> --- a/lib/string_helpers.c
>> +++ b/lib/string_helpers.c
>> @@ -44,7 +44,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>  		[STRING_UNITS_2] = 1024,
>>  	};
>>  	int i, j;
>> -	u32 remainder = 0, sf_cap, exp;
>> +	u64 remainder = 0;
>> +	u32 sf_cap;
>>  	char tmp[8];
>>  	const char *unit;
>>  
>> @@ -53,28 +54,36 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>  	if (!size)
>>  		goto out;
>>  
>> -	while (blk_size >= divisor[units]) {
>> -		remainder = do_div(blk_size, divisor[units]);
>> -		i++;
>> +	if (!blk_size) {
>> +		WARN_ON(1);
>> +		size = 0;
>> +		goto out;
>>  	}
>
> I don't think we need to handle that, but if you want, please put the
> WARN inside the conditional (so say "if (WARN_ON(!blk_size)) {...}". And
> don't make it _ONCE; the ordinary version is slightly cheaper (since
> there's no static bool __warned and no code to manage that).

Ok, I'll do that in a separate patch and hope Andy is not going to
complain about not adding _ONCE.

>
>>  
>> -	exp = divisor[units] / blk_size;
>>  	/*
>> -	 * size must be strictly greater than exp here to ensure that remainder
>> -	 * is greater than divisor[units] coming out of the if below.
>> +	 * size can be huge and doing size * blk_size right away can overflow.
>> +	 * As a first step reduce huge size to something less than
>> +	 * blk_size * divisor[units].
>>  	 */
>> -	if (size > exp) {
>> +	while (size > (u64)blk_size * divisor[units]) {
>
> It seems weird that we reduce _more_ for smaller block sizes - indeed,
> for blk_size==1 we end up reducing size to <= 1000 (or 1024), which is
> certainly a good way to lose some precision. Also, this relies on
> blk_size being smaller than (roughly) sqrt(U64_MAX/divisor[units]),
> which is of course true in practice, but a slightly annoying implicit
> assumption.
>
> I do think that my approach of reducing size till it's smaller than
> U64_MAX/blk_size is simpler and better. There's much less fixup
> code. It's simply
>
>   while (size > div_u64(U64_MAX, blk_size) {
>     do_div(size, divisor[units]);
>     ++i;
>   }
>   size *= blk_size;
>   while (size > divisor[units]) {
>     remainder = do_div(size, divisor[units]);
>     ++i;
>   }
>
> which is as self-explaining as it gets.
>

OK, let me borrow this idea from you and change Reported-by: tag with
Suggested-by:. I'll test and send v3.

> And yes, one needs to use the include/linux/math64.h functions/macros
> for u64/u32 divisions - otherwise I'm pretty sure one will get friendly
> mails from the build bot.
>
>   while (size > )
>>  		remainder = do_div(size, divisor[units]);
>> -		remainder *= blk_size;
>>  		i++;
>> -	} else {
>> -		remainder *= size;
>>  	}
>>  
>> +	/* Now we're OK with doing size * blk_size, it won't overflow. */
>>  	size *= blk_size;
>> +	remainder *= blk_size;
>> +	/*
>> +	 * We were doing partial multiplication by blk_size.
>> +	 * remainder >= divisor[units] here means size should be increased.
>> +	 */
>>  	size += remainder / divisor[units];
>> -	remainder %= divisor[units];
>> +	remainder -= (remainder / divisor[units]) * divisor[units];
>>  
>> +	/*
>> +	 * Normalize. size >= divisor[units] means we still have enough
>> +	 * precision and dropping remainder is fine.
>> +	 */
>>  	while (size >= divisor[units]) {
>>  		remainder = do_div(size, divisor[units]);
>>  		i++;
>> @@ -87,7 +96,8 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>  	if (j) {
>>  		remainder *= 1000;
>>  		remainder /= divisor[units];
>> -		snprintf(tmp, sizeof(tmp), ".%03u", remainder);
>> +		/* remainder is < divisor[units] here, (u32) is legit */
>
> What is actually important is that remainder is < 1000. remainder was
> initially < divisor[units], but then the multiplication and division
> transformed that into "1/1000s" of whatever unit we're using. 
>
>> +		snprintf(tmp, sizeof(tmp), ".%03u", (u32)remainder);
>>  		tmp[j+1] = '\0';
>>  	}
>>  
>> @@ -97,6 +107,7 @@ void string_get_size(u64 size, u32 blk_size, const enum string_size_units units,
>>  	else
>>  		unit = units_str[units][i];
>>  
>> +	/* size is < divisor[units] here, (u32) is legit */
>>  	snprintf(buf, len, "%u%s %s", (u32)size,
>>  		 tmp, unit);
>>  }

-- 
  Vitaly

  reply	other threads:[~2015-10-29 13:51 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-27 14:06 [PATCH v2 0/3] lib/string_helpers: fix precision issues and introduce tests Vitaly Kuznetsov
2015-10-27 14:06 ` [PATCH v2 1/3] lib/string_helpers: change blk_size to u32 for string_get_size() interface Vitaly Kuznetsov
2015-10-27 14:06 ` [PATCH v2 2/3] lib/string_helpers.c: don't lose precision in string_get_size() Vitaly Kuznetsov
2015-10-27 15:25   ` Andy Shevchenko
2015-10-27 16:16     ` Vitaly Kuznetsov
2015-10-27 16:24       ` Andy Shevchenko
2015-10-27 21:02   ` Rasmus Villemoes
2015-10-29 13:51     ` Vitaly Kuznetsov [this message]
2015-10-27 14:06 ` [PATCH v2 3/3] lib/test-string_helpers.c: add string_get_size() tests Vitaly Kuznetsov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87eggdzst7.fsf@vitty.brq.redhat.com \
    --to=vkuznets@redhat.com \
    --cc=JBottomley@Odin.com \
    --cc=akpm@linux-foundation.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).