public inbox for linux-arch@vger.kernel.org
* [PATCH 0/5] 64-by-32 division optimization for constant divisors on 32-bit machines
@ 2015-11-02 22:33 Nicolas Pitre
  2015-11-02 22:33 ` Nicolas Pitre
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Nicolas Pitre @ 2015-11-02 22:33 UTC (permalink / raw)
  To: Alexey Brodkin, Måns Rullgård
  Cc: Arnd Bergmann, rmk+kernel, linux-arch, linux-kernel

This is a generalization of the optimization I produced for ARM a decade
ago to turn division by a constant into a multiplication by the divisor's
reciprocal. It turns out that after all those years gcc is still not
optimizing that case on its own.

This has important performance benefits as discussed in this thread:

https://lkml.org/lkml/2015/10/28/851

This series brings the formerly ARM-only optimization to all 32-bit
architectures using C code by default.  The possibility for the actual
multiplication to be implemented in assembly is provided in order to get
optimal code.  The ARM version can be used as an example implementation
for other interested architectures.
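
To illustrate the reciprocal multiplication described above, here is a
minimal sketch in plain C. It is not the code from this series: the
function names mul64_hi() and div10_by_reciprocal() are hypothetical,
and the divisor 10 with its well-known 64-bit reciprocal constant
0xCCCCCCCCCCCCCCCD (ceil(2^67 / 10)) is chosen only as an example. The
high half of the 128-bit product is built from four 32x32->64
multiplications, which is the kind of operation a 32-bit machine can do
cheaply.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the upper 64 bits of the 128-bit product
 * a * b, using only 32x32->64 multiplications.
 */
static uint64_t mul64_hi(uint64_t a, uint64_t b)
{
	uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
	uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

	uint64_t ll = (uint64_t)a_lo * b_lo;
	uint64_t lh = (uint64_t)a_lo * b_hi;
	uint64_t hl = (uint64_t)a_hi * b_lo;
	uint64_t hh = (uint64_t)a_hi * b_hi;

	/* Sum of three values below 2^32 each: cannot overflow 64 bits. */
	uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

	return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}

/*
 * Hypothetical example: n / 10 without a division instruction.
 * 0xCCCCCCCCCCCCCCCD is ceil(2^67 / 10), so the quotient is the
 * 128-bit product shifted right by 67 bits (64 + 3).
 */
static uint64_t div10_by_reciprocal(uint64_t n)
{
	return mul64_hi(n, 0xCCCCCCCCCCCCCCCDULL) >> 3;
}
```

With these definitions, div10_by_reciprocal(n) should agree with n / 10
for every 64-bit n. Other divisors need their own constant and shift,
and some need an extra correction step, which is what a generic
implementation has to work out at compile time.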


Nicolas


end of thread, other threads:[~2015-11-19 16:44 UTC | newest]

Thread overview: 22+ messages
2015-11-02 22:33 [PATCH 0/5] 64-by-32 division optimization for constant divisors on 32-bit machines Nicolas Pitre
2015-11-02 22:33 ` Nicolas Pitre
2015-11-02 22:33 ` [PATCH 1/5] div64.h: optimize do_div() for power-of-two constant divisors Nicolas Pitre
2015-11-02 22:33   ` Nicolas Pitre
2015-11-02 22:33 ` [PATCH 2/5] do_div(): generic optimization for constant divisor on 32-bit machines Nicolas Pitre
2015-11-03  5:32   ` kbuild test robot
2015-11-03  9:15     ` Arnd Bergmann
2015-11-04 21:04       ` Nicolas Pitre
2015-11-04 21:42         ` Måns Rullgård
2015-11-04 21:42           ` Måns Rullgård
2015-11-02 22:33 ` [PATCH 3/5] __div64_const32(): abstract out the actual 128-bit cross product code Nicolas Pitre
2015-11-02 22:33   ` Nicolas Pitre
2015-11-02 22:33 ` [PATCH 4/5] __div64_32(): make it overridable at compile time Nicolas Pitre
2015-11-02 22:33   ` Nicolas Pitre
2015-11-02 22:33 ` [PATCH 5/5] ARM: asm/div64.h: adjust to generic code Nicolas Pitre
2015-11-02 22:33   ` Nicolas Pitre
2015-11-03  1:25   ` kbuild test robot
2015-11-03  4:03     ` Nicolas Pitre
2015-11-03 21:39   ` kbuild test robot
2015-11-19 16:36   ` Måns Rullgård
2015-11-19 16:42     ` Nicolas Pitre
2015-11-19 16:44       ` Måns Rullgård
