linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 next 00/10] Implement mul_u64_u64_div_u64_roundup()
@ 2025-06-14  9:53 David Laight
  2025-06-14  9:53 ` [PATCH v3 next 01/10] lib: mul_u64_u64_div_u64() rename parameter 'c' to 'd' David Laight
                   ` (10 more replies)
  0 siblings, 11 replies; 44+ messages in thread
From: David Laight @ 2025-06-14  9:53 UTC (permalink / raw)
  To: Andrew Morton, linux-kernel
  Cc: David Laight, u.kleine-koenig, Nicolas Pitre, Oleg Nesterov,
	Peter Zijlstra, Biju Das

The pwm-stm32.c code wants a 'rounding up' version of mul_u64_u64_div_u64().
This can be done simply by adding 'divisor - 1' to the 128bit product.
Implement mul_u64_add_u64_div_u64(a, b, c, d) = (a * b + c)/d based on the
existing code.
Define mul_u64_u64_div_u64(a, b, d) as mul_u64_add_u64_div_u64(a, b, 0, d) and
mul_u64_u64_div_u64_roundup(a, b, d) as mul_u64_add_u64_div_u64(a, b, d-1, d).

Only x86-64 has an optimsed (asm) version of the function.
That is optimised to avoid the 'add c' when c is known to be zero.
In all other cases the extra code will be noise compared to the software
divide code.

The test module has been upadted to test mul_u64_u64_div_u64_roundup() and
also enhanced it to verify the C division code on x86-64 and the 32bit
division code on 64bit.

Changes for v2:
- Rename the 'divisor' parameter from 'c' to 'd'.
- Add an extra patch to use BUG_ON() to trap zero divisors.
- Remove the last patch that ran the C code on x86-64
  (I've a plan to do that differently).

Changes for v3:
- Replace the BUG_ON() (or panic in the original version) for zero
  divisors with a WARN_ONCE() and return zero.
- Remove the 'pre-multiply' check for small products.
  Completely non-trivial on 32bit systems.
- Use mul_u32_u32() and the new add_u64_u32() to stop gcc generating
  pretty much pessimal code for x86 with lots of register spills.
- Replace the 'bit at a time' divide with one that generates 16 bits
  per iteration on 32bit systems and 32 bits per iteration on 64bit.
  Massively faster, the tests run in under 1/3 the time.

David Laight (10):
  lib: mul_u64_u64_div_u64() rename parameter 'c' to 'd'
  lib: mul_u64_u64_div_u64() Use WARN_ONCE() for divide errors.
  lib: mul_u64_u64_div_u64() simplify check for a 64bit product
  lib: Add mul_u64_add_u64_div_u64() and mul_u64_u64_div_u64_roundup()
  lib: Add tests for mul_u64_u64_div_u64_roundup()
  lib: test_mul_u64_u64_div_u64: Test both generic and arch versions
  lib: mul_u64_u64_div_u64() optimise multiply on 32bit x86
  lib: mul_u64_u64_div_u64() Separate multiply to a helper for clarity
  lib: mul_u64_u64_div_u64() Optimise the divide code
  lib: test_mul_u64_u64_div_u64: Test the 32bit code on 64bit

 arch/x86/include/asm/div64.h        |  33 ++++-
 include/linux/math64.h              |  56 ++++++++-
 lib/math/div64.c                    | 189 +++++++++++++++++++---------
 lib/math/test_mul_u64_u64_div_u64.c | 187 +++++++++++++++++++--------
 4 files changed, 349 insertions(+), 116 deletions(-)

-- 
2.39.5


^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2025-07-14  7:06 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-14  9:53 [PATCH v3 next 00/10] Implement mul_u64_u64_div_u64_roundup() David Laight
2025-06-14  9:53 ` [PATCH v3 next 01/10] lib: mul_u64_u64_div_u64() rename parameter 'c' to 'd' David Laight
2025-06-14  9:53 ` [PATCH v3 next 02/10] lib: mul_u64_u64_div_u64() Use WARN_ONCE() for divide errors David Laight
2025-06-14 15:17   ` Nicolas Pitre
2025-06-14 21:26     ` David Laight
2025-06-14 22:23       ` Nicolas Pitre
2025-06-14  9:53 ` [PATCH v3 next 03/10] lib: mul_u64_u64_div_u64() simplify check for a 64bit product David Laight
2025-06-14 14:01   ` Nicolas Pitre
2025-06-14  9:53 ` [PATCH v3 next 04/10] lib: Add mul_u64_add_u64_div_u64() and mul_u64_u64_div_u64_roundup() David Laight
2025-06-14 14:06   ` Nicolas Pitre
2025-06-14  9:53 ` [PATCH v3 next 05/10] lib: Add tests for mul_u64_u64_div_u64_roundup() David Laight
2025-06-14 15:19   ` Nicolas Pitre
2025-06-17  4:30   ` Nicolas Pitre
2025-06-14  9:53 ` [PATCH v3 next 06/10] lib: test_mul_u64_u64_div_u64: Test both generic and arch versions David Laight
2025-06-14 15:25   ` Nicolas Pitre
2025-06-18  1:39     ` Nicolas Pitre
2025-06-14  9:53 ` [PATCH v3 next 07/10] lib: mul_u64_u64_div_u64() optimise multiply on 32bit x86 David Laight
2025-06-14 15:31   ` Nicolas Pitre
2025-06-14  9:53 ` [PATCH v3 next 08/10] lib: mul_u64_u64_div_u64() Separate multiply to a helper for clarity David Laight
2025-06-14 15:37   ` Nicolas Pitre
2025-06-14 21:30     ` David Laight
2025-06-14 22:27       ` Nicolas Pitre
2025-06-14  9:53 ` [PATCH v3 next 09/10] lib: mul_u64_u64_div_u64() Optimise the divide code David Laight
2025-06-17  4:16   ` Nicolas Pitre
2025-06-18  1:33     ` Nicolas Pitre
2025-06-18  9:16       ` David Laight
2025-06-18 15:39         ` Nicolas Pitre
2025-06-18 16:42           ` Nicolas Pitre
2025-06-18 17:54           ` David Laight
2025-06-18 20:12             ` Nicolas Pitre
2025-06-18 22:26               ` David Laight
2025-06-19  2:43                 ` Nicolas Pitre
2025-06-19  8:32                   ` David Laight
2025-06-26 21:46       ` David Laight
2025-06-27  3:48         ` Nicolas Pitre
2025-07-09 14:24   ` David Laight
2025-07-10  9:39     ` 答复: [????] " Li,Rongqing
2025-07-10 10:35       ` David Laight
2025-07-11 21:17     ` David Laight
2025-07-11 21:40       ` Nicolas Pitre
2025-07-14  7:06         ` David Laight
2025-06-14  9:53 ` [PATCH v3 next 10/10] lib: test_mul_u64_u64_div_u64: Test the 32bit code on 64bit David Laight
2025-06-14 10:27 ` [PATCH v3 next 00/10] Implement mul_u64_u64_div_u64_roundup() Peter Zijlstra
2025-06-14 11:59   ` David Laight

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).