From: Alexey.Brodkin@synopsys.com (Alexey Brodkin)
To: linux-snps-arc@lists.infradead.org
Subject: [PATCH] __div64_32: implement division by multiplication for 32-bit arches
Date: Thu, 29 Oct 2015 14:32:23 +0000 [thread overview]
Message-ID: <1446129143.3203.19.camel@synopsys.com> (raw)
In-Reply-To: <20151029133122.GK8644@n2100.arm.linux.org.uk>
Hi Russel,
On Thu, 2015-10-29@13:31 +0000, Russell King - ARM Linux wrote:
> On Thu, Oct 29, 2015@01:47:35AM +0300, Alexey Brodkin wrote:
> > diff --git a/lib/div64.c b/lib/div64.c
> > index 62a698a..3055328 100644
> > --- a/lib/div64.c
> > +++ b/lib/div64.c
> > +/*
> > + * If the divisor happens to be constant, we determine the appropriate
> > + * inverse at compile time to turn the division into a few inline
> > + * multiplications instead which is much faster.
> > + */
> > uint32_t __attribute__((weak)) __div64_32(uint64_t *n, uint32_t base)
> > {
> > - uint64_t rem = *n;
> > - uint64_t b = base;
> > - uint64_t res, d = 1;
> > - uint32_t high = rem >> 32;
> > -
> > - /* Reduce the thing a bit first */
> > - res = 0;
> > - if (high >= base) {
> > - high /= base;
> > - res = (uint64_t) high << 32;
> > - rem -= (uint64_t) (high*base) << 32;
> > - }
> > + unsigned int __r, __b = base;
> >
> > - while ((int64_t)b > 0 && b < rem) {
> > - b = b+b;
> > - d = d+d;
> > - }
> > + if (!__builtin_constant_p(__b) || __b == 0) {
>
> Can you explain who __builtin_constant_p(__b) can be anything but false
> here? I can't see that this will ever be true.
>
> This is a function in its own .c file - the compiler will have no
> knowledge about the callers of this function scattered throughout the
> kernel, and it has to assume that the 'base' argument to this function
> is variable. So, __builtin_constant_p(__b) will always be false, which
> means this if () statement will always be true and the else clause will
> never be used.
Essentially constant propagation will only happen if __div64_32() is inlined.
For that we need to add "inline" specifier to __div64_32(), but that in its
turn will prevent use of arch-specific more optimal __div64_32() implementation.
And that was my main question how to implement this properly: have better generic
do_div() or __div64_32() as its heavy lifting part and still keep an ability for
some architectures to use their own implementations.
-Alexey
WARNING: multiple messages have this Message-ID (diff)
From: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
To: "linux@arm.linux.org.uk" <linux@arm.linux.org.uk>
Cc: "nicolas.pitre@linaro.org" <nicolas.pitre@linaro.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Vineet.Gupta1@synopsys.com" <Vineet.Gupta1@synopsys.com>,
"shemminger@linux-foundation.org"
<shemminger@linux-foundation.org>,
"mingo@elte.hu" <mingo@elte.hu>,
"linux-snps-arc@lists.infradead.org"
<linux-snps-arc@lists.infradead.org>,
"davem@davemloft.net" <davem@davemloft.net>
Subject: Re: [PATCH] __div64_32: implement division by multiplication for 32-bit arches
Date: Thu, 29 Oct 2015 14:32:23 +0000 [thread overview]
Message-ID: <1446129143.3203.19.camel@synopsys.com> (raw)
In-Reply-To: <20151029133122.GK8644@n2100.arm.linux.org.uk>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 2179 bytes --]
Hi Russel,
On Thu, 2015-10-29 at 13:31 +0000, Russell King - ARM Linux wrote:
> On Thu, Oct 29, 2015 at 01:47:35AM +0300, Alexey Brodkin wrote:
> > diff --git a/lib/div64.c b/lib/div64.c
> > index 62a698a..3055328 100644
> > --- a/lib/div64.c
> > +++ b/lib/div64.c
> > +/*
> > + * If the divisor happens to be constant, we determine the appropriate
> > + * inverse at compile time to turn the division into a few inline
> > + * multiplications instead which is much faster.
> > + */
> > uint32_t __attribute__((weak)) __div64_32(uint64_t *n, uint32_t base)
> > {
> > - uint64_t rem = *n;
> > - uint64_t b = base;
> > - uint64_t res, d = 1;
> > - uint32_t high = rem >> 32;
> > -
> > - /* Reduce the thing a bit first */
> > - res = 0;
> > - if (high >= base) {
> > - high /= base;
> > - res = (uint64_t) high << 32;
> > - rem -= (uint64_t) (high*base) << 32;
> > - }
> > + unsigned int __r, __b = base;
> >
> > - while ((int64_t)b > 0 && b < rem) {
> > - b = b+b;
> > - d = d+d;
> > - }
> > + if (!__builtin_constant_p(__b) || __b == 0) {
>
> Can you explain who __builtin_constant_p(__b) can be anything but false
> here? I can't see that this will ever be true.
>
> This is a function in its own .c file - the compiler will have no
> knowledge about the callers of this function scattered throughout the
> kernel, and it has to assume that the 'base' argument to this function
> is variable. So, __builtin_constant_p(__b) will always be false, which
> means this if () statement will always be true and the else clause will
> never be used.
Essentially constant propagation will only happen if __div64_32() is inlined.
For that we need to add "inline" specifier to __div64_32(), but that in its
turn will prevent use of arch-specific more optimal __div64_32() implementation.
And that was my main question how to implement this properly: have better generic
do_div() or __div64_32() as its heavy lifting part and still keep an ability for
some architectures to use their own implementations.
-Alexey
ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥
next prev parent reply other threads:[~2015-10-29 14:32 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-28 22:47 [PATCH] __div64_32: implement division by multiplication for 32-bit arches Alexey Brodkin
2015-10-28 22:47 ` Alexey Brodkin
2015-10-28 23:32 ` Nicolas Pitre
2015-10-28 23:32 ` Nicolas Pitre
2015-10-29 7:34 ` Alexey Brodkin
2015-10-29 7:34 ` Alexey Brodkin
2015-10-30 1:26 ` Nicolas Pitre
2015-10-30 1:26 ` Nicolas Pitre
2015-10-30 5:41 ` Vineet Gupta
2015-10-30 5:41 ` Vineet Gupta
2015-10-30 12:41 ` Måns Rullgård
2015-10-30 12:41 ` Måns Rullgård
2015-10-30 12:40 ` Måns Rullgård
2015-10-30 12:40 ` Måns Rullgård
2015-10-30 15:17 ` Nicolas Pitre
2015-10-30 15:17 ` Nicolas Pitre
2015-10-30 15:54 ` Alexey Brodkin
2015-10-30 15:54 ` Alexey Brodkin
2015-10-30 16:55 ` Nicolas Pitre
2015-10-30 16:55 ` Nicolas Pitre
2015-10-30 17:45 ` Måns Rullgård
2015-10-30 17:45 ` Måns Rullgård
2015-11-04 23:46 ` Nicolas Pitre
2015-11-04 23:46 ` Nicolas Pitre
2015-11-04 23:48 ` Nicolas Pitre
2015-11-04 23:48 ` Nicolas Pitre
2015-11-05 3:13 ` Vineet Gupta
2015-11-05 3:13 ` Vineet Gupta
2015-11-05 5:06 ` Nicolas Pitre
2015-11-05 5:06 ` Nicolas Pitre
2015-11-04 23:49 ` Måns Rullgård
2015-11-04 23:49 ` Måns Rullgård
2015-10-30 14:28 ` Alexey Brodkin
2015-10-30 14:28 ` Alexey Brodkin
2015-10-29 0:36 ` kbuild test robot
2015-10-29 0:36 ` kbuild test robot
2015-10-29 12:52 ` Måns Rullgård
2015-10-29 12:52 ` Måns Rullgård
2015-10-29 13:05 ` Alexey Brodkin
2015-10-29 13:05 ` Alexey Brodkin
2015-10-29 13:37 ` Måns Rullgård
2015-10-29 13:37 ` Måns Rullgård
2015-10-29 13:31 ` Russell King - ARM Linux
2015-10-29 13:31 ` Russell King - ARM Linux
2015-10-29 14:32 ` Alexey Brodkin [this message]
2015-10-29 14:32 ` Alexey Brodkin
2015-10-29 17:09 ` Randy Dunlap
2015-10-29 17:09 ` Randy Dunlap
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1446129143.3203.19.camel@synopsys.com \
--to=alexey.brodkin@synopsys.com \
--cc=linux-snps-arc@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.