From: David Laight <david.laight.linux@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org
Cc: David Laight <david.laight.linux@gmail.com>,
u.kleine-koenig@baylibre.com, Nicolas Pitre <npitre@baylibre.com>,
Oleg Nesterov <oleg@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Biju Das <biju.das.jz@bp.renesas.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Li RongQing <lirongqing@baidu.com>, Yu Kuai <yukuai3@huawei.com>,
Khazhismel Kumykov <khazhy@chromium.org>,
Jens Axboe <axboe@kernel.dk>,
x86@kernel.org
Subject: [PATCH v4 next 0/9] Implement mul_u64_u64_div_u64_roundup()
Date: Wed, 29 Oct 2025 17:38:19 +0000
Message-ID: <20251029173828.3682-1-david.laight.linux@gmail.com>
The pwm-stm32.c code wants a 'rounding up' version of mul_u64_u64_div_u64().
This can be done simply by adding 'divisor - 1' to the 128bit product.
Implement mul_u64_add_u64_div_u64(a, b, c, d) = (a * b + c)/d based on the
existing code.
Define mul_u64_u64_div_u64(a, b, d) as mul_u64_add_u64_div_u64(a, b, 0, d) and
mul_u64_u64_div_u64_roundup(a, b, d) as mul_u64_add_u64_div_u64(a, b, d-1, d).
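As a rough illustration of the definitions above (not the kernel implementation), the three helpers can be modelled in userspace C with the compiler's unsigned __int128 extension, which GCC and Clang provide on 64bit targets. The '_sketch' names are made up for this example, and neither a zero divisor nor a quotient that overflows 64 bits is handled.

```c
#include <stdint.h>

/*
 * Userspace model only: (a * b + c) / d with a 128bit intermediate.
 * Assumes d != 0 and that the quotient fits in 64 bits.
 * a * b + c cannot overflow 128 bits for any 64bit a, b and c.
 */
static inline uint64_t mul_u64_add_u64_div_u64_sketch(uint64_t a, uint64_t b,
						      uint64_t c, uint64_t d)
{
	unsigned __int128 n = (unsigned __int128)a * b + c;

	return (uint64_t)(n / d);
}

/* Round down: nothing is added to the product. */
static inline uint64_t mul_u64_u64_div_u64_sketch(uint64_t a, uint64_t b,
						  uint64_t d)
{
	return mul_u64_add_u64_div_u64_sketch(a, b, 0, d);
}

/* Round up: add 'divisor - 1' to the product before dividing. */
static inline uint64_t mul_u64_u64_div_u64_roundup_sketch(uint64_t a, uint64_t b,
							  uint64_t d)
{
	return mul_u64_add_u64_div_u64_sketch(a, b, d - 1, d);
}
```

For example, with a = 10, b = 10, d = 3 the round-down variant gives 33 and the round-up variant gives 34.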
Only x86-64 has an optimised (asm) version of the function.
That is optimised to avoid the 'add c' when c is known to be zero.
In all other cases the extra code will be noise compared to the software
divide code.
The test module has been updated to test mul_u64_u64_div_u64_roundup() and
also enhanced to verify the C division code on x86-64 and the 32bit
division code on 64bit.
Changes for v2:
- Rename the 'divisor' parameter from 'c' to 'd'.
- Add an extra patch to use BUG_ON() to trap zero divisors.
- Remove the last patch that ran the C code on x86-64
(I've a plan to do that differently).
Changes for v3:
- Replace the BUG_ON() (or panic in the original version) for zero
divisors with a WARN_ONCE() and return zero.
- Remove the 'pre-multiply' check for small products.
Completely non-trivial on 32bit systems.
- Use mul_u32_u32() and the new add_u64_u32() to stop gcc generating
pretty much pessimal code for x86 with lots of register spills.
- Replace the 'bit at a time' divide with one that generates 16 bits
per iteration on 32bit systems and 32 bits per iteration on 64bit.
Massively faster, the tests run in under 1/3 the time.
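For context, a 'bit at a time' divide of a 128bit value by a 64bit divisor looks roughly like the sketch below. This is a generic shift-and-subtract loop written for illustration, not the code that was removed from lib/math/div64.c. The function name is hypothetical, and it assumes n_hi < d so that the quotient fits in 64 bits. It always runs 64 iterations, which is the cost that generating 16 or 32 quotient bits per iteration avoids.

```c
#include <stdint.h>

/*
 * Illustrative only: divide the 128bit value (n_hi:n_lo) by d,
 * producing one quotient bit per loop iteration.
 * Requires d != 0 and n_hi < d (otherwise the quotient overflows).
 */
static uint64_t div_u128_u64_bitwise_sketch(uint64_t n_hi, uint64_t n_lo,
					    uint64_t d)
{
	uint64_t q = 0;
	int i;

	for (i = 0; i < 64; i++) {
		/* Bit shifted out of n_hi: the partial remainder is 65 bits. */
		uint64_t carry = n_hi >> 63;

		n_hi = (n_hi << 1) | (n_lo >> 63);
		n_lo <<= 1;
		q <<= 1;

		/* With carry set the remainder is >= 2^64 > d, so subtract. */
		if (carry || n_hi >= d) {
			n_hi -= d;	/* wraps correctly modulo 2^64 */
			q |= 1;
		}
	}

	return q;
}
```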
Changes for v4:
No significant code changes.
- Rebase on 6.18-rc2
- Don't change the behaviour for overflow (return ~0ull) or divide
by zero (execute ~0ul/0).
- Merge patches 8 and 9 to avoid bisection issues.
- Fix build of 32bit test cases on non-x86.
- Fix shell script that verifies test cases.
I've left the x86-64 faulting on both overflow and divide by zero.
The patch to add an exception table entry to return ~0 for both
doesn't seem to have been merged.
If merged it would make sense for the C version to return ~0 for both.
Callers can check for a result of ~0 and then check the divisor if
they care about overflow (etc).
(A valid quotient of ~0 is pretty unlikely and marginal changes to the
input values are likely to generate a real overflow.)
The code that faulted on overflow was about to get invalid results
because one of the 64bit inputs would itself wrap very soon.
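If both error cases did return ~0, a caller-side check could look something like the sketch below. It models that hypothetical behaviour in userspace with unsigned __int128 and does not describe what the code in this series does today (x86-64 faults, and the C version executes ~0ul/0 for a zero divisor). All names here (mul_div_model, scale_checked, enum scale_status) are invented for the example.

```c
#include <stdint.h>

/*
 * Hypothetical model of a version that returns ~0 for both
 * divide by zero and quotient overflow.
 */
static uint64_t mul_div_model(uint64_t a, uint64_t b, uint64_t d)
{
	unsigned __int128 n = (unsigned __int128)a * b;

	/* The quotient fits in 64 bits only if the high 64 bits are < d. */
	if (d == 0 || (uint64_t)(n >> 64) >= d)
		return ~0ull;

	return (uint64_t)(n / d);
}

enum scale_status {
	SCALE_OK,
	SCALE_DIV_ZERO,
	SCALE_OVERFLOW_OR_MAX,	/* overflow, or a (rare) valid result of ~0 */
};

/* Check for ~0 first, then look at the divisor to tell the cases apart. */
static enum scale_status scale_checked(uint64_t a, uint64_t b, uint64_t d,
				       uint64_t *res)
{
	*res = mul_div_model(a, b, d);
	if (*res != ~0ull)
		return SCALE_OK;

	return d == 0 ? SCALE_DIV_ZERO : SCALE_OVERFLOW_OR_MAX;
}
```

The common path costs one compare against ~0; the divisor is only inspected once that compare fails.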
David Laight (9):
lib: mul_u64_u64_div_u64() rename parameter 'c' to 'd'
lib: mul_u64_u64_div_u64() Combine overflow and divide by zero checks
lib: mul_u64_u64_div_u64() simplify check for a 64bit product
lib: Add mul_u64_add_u64_div_u64() and mul_u64_u64_div_u64_roundup()
lib: Add tests for mul_u64_u64_div_u64_roundup()
lib: test_mul_u64_u64_div_u64: Test both generic and arch versions
lib: mul_u64_u64_div_u64() optimise multiply on 32bit x86
lib: mul_u64_u64_div_u64() Optimise the divide code
lib: test_mul_u64_u64_div_u64: Test the 32bit code on 64bit
arch/x86/include/asm/div64.h | 39 ++++--
include/linux/math64.h | 59 ++++++++-
lib/math/div64.c | 183 ++++++++++++++++++---------
lib/math/test_mul_u64_u64_div_u64.c | 190 ++++++++++++++++++++--------
4 files changed, 352 insertions(+), 119 deletions(-)
--
2.39.5