From: David Laight <david.laight.linux@gmail.com>
To: "Uwe Kleine-König" <u.kleine-koenig@baylibre.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, Nicolas Pitre <npitre@baylibre.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Biju Das <biju.das.jz@bp.renesas.com>
Subject: Re: [PATCH next 0/3] lib: Implement mul_u64_u64_div_u64_roundup()
Date: Fri, 16 May 2025 13:17:58 +0100
Message-ID: <20250516131758.2a304d1b@pumpkin>
In-Reply-To: <5e7mp3nreaadppjhxhpffyff2d2ccwcjf2suonxe43eofngddu@qfufr2wiw7yn>

On Fri, 16 May 2025 11:47:58 +0200
Uwe Kleine-König <u.kleine-koenig@baylibre.com> wrote:

> Hello David,
> 
> On Sat, Apr 05, 2025 at 09:45:27PM +0100, David Laight wrote:
> > The pwm-stm32.c code wants a 'rounding up' version of mul_u64_u64_div_u64().
> > This can be done simply by adding 'divisor - 1' to the 128bit product.
> > Implement mul_u64_add_u64_div_u64(a, b, c, d) = (a * b + c)/d based on the
> > existing code.
> > Define mul_u64_u64_div_u64(a, b, d) as mul_u64_add_u64_div_u64(a, b, 0, d) and
> > mul_u64_u64_div_u64_roundup(a, b, d) as mul_u64_add_u64_div_u64(a, b, d-1, d).
> > 
> > Only x86-64 has an optimised (asm) version of the function.
> > That is optimised to avoid the 'add c' when c is known to be zero.
> > In all other cases the extra code will be noise compared to the software
> > divide code.
> > 
> > I've updated the test module to test mul_u64_u64_div_u64_roundup() and
> > also enhanced it to verify the C division code on x86-64.
> > 
> > Note that the code generated by gcc (eg for 32bit x86) just for the multiply
> > is rather more horrid than one would expect (clang does better).
> > I dread to think how long the divide loop takes.
> > And I'm not at all sure the call in kernel/sched/cputime.c isn't in a
> > relatively common path (rather than just hardware initialisation).
> > 
> > David Laight (3):
> >   lib: Add mul_u64_add_u64_div_u64() and mul_u64_u64_div_u64_roundup()
> >   lib: Add tests for mul_u64_u64_div_u64_roundup()
> >   lib: Update the muldiv64 tests to verify the C on x86-64  
> 
> I wonder what happened to this series. I'd like to make use of
> mul_u64_u64_div_u64_roundup() so I'd be interested to get this into the
> mainline.
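
For readers following the thread, the semantics quoted above can be sketched in user-space C. This is only an illustration, not the kernel implementation: it assumes a compiler that provides unsigned __int128 (gcc or clang on a 64bit target), a non-zero divisor, and a quotient that fits in 64 bits. The function names are the ones from the cover letter; the bodies are an assumption.

```c
#include <stdint.h>

/* (a * b + c) / d using a 128bit intermediate.
 * Sketch only: assumes d != 0 and that the result fits in 64 bits.
 * a * b + c cannot overflow 128 bits for any 64bit a, b and c. */
static inline uint64_t mul_u64_add_u64_div_u64(uint64_t a, uint64_t b,
					       uint64_t c, uint64_t d)
{
	unsigned __int128 n = (unsigned __int128)a * b + c;

	return (uint64_t)(n / d);
}

/* Round down: add nothing to the product. */
static inline uint64_t mul_u64_u64_div_u64(uint64_t a, uint64_t b, uint64_t d)
{
	return mul_u64_add_u64_div_u64(a, b, 0, d);
}

/* Round up: add 'divisor - 1' to the product before dividing. */
static inline uint64_t mul_u64_u64_div_u64_roundup(uint64_t a, uint64_t b,
						   uint64_t d)
{
	return mul_u64_add_u64_div_u64(a, b, d - 1, d);
}
```

With a = 10, b = 10, d = 3 the round-down variant gives 33 and the round-up variant gives 34; when the product is an exact multiple of d both return the same value.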

I have a WIP rewrite of the divide code that speeds it up considerably
on everything that is 'not amd64'.

IIRC (the test machine is powered off) the test cases with random data
go down from over 900 clocks to below 150 for x86-32.
The 64bit code (on x86-64, but skipping the divide instruction) is ~70
clocks rather than nearer 180.
Both of those clock counts are almost data independent.

The code relies on the cpu having an 'unsigned long / unsigned long'
divide instruction.

I got reasonable code for 32bit x86 using gcc 12.2 from debian.
Then I copied it to godbolt and found gcc before 15.0 making a pig's
breakfast of it (all clang versions are fine).
So it needs some rework.

I still need to actually plumb it into the kernel sources.

I do want to test the divide loop on x86-64, at least for 64bit.
Probably by including the code directly into the test module.
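
As a rough illustration of what testing a software divide loop against the hardware can look like, here is a user-space sketch. It is not the kernel's loop and not the WIP rewrite: the helper names are hypothetical, the divide is a plain shift-and-subtract (one quotient bit per iteration), and the reference check assumes unsigned __int128 is available (gcc or clang on a 64bit target).

```c
#include <stdint.h>

/* Hypothetical helper: divide the 128bit value (hi:lo) by d, one bit
 * at a time. Assumes d != 0 and hi < d so the quotient fits in 64 bits. */
static uint64_t div_u128_u64_bitwise(uint64_t hi, uint64_t lo, uint64_t d)
{
	uint64_t rem = hi;	/* running remainder, always < d */
	uint64_t quot = 0;

	for (int i = 63; i >= 0; i--) {
		/* Bit that is shifted out when rem is doubled. */
		uint64_t carry = rem >> 63;

		rem = (rem << 1) | ((lo >> i) & 1);
		quot <<= 1;
		/* The true remainder is the 65bit value carry:rem. */
		if (carry || rem >= d) {
			rem -= d;
			quot |= 1;
		}
	}
	return quot;
}

/* Hypothetical check: compare against the compiler's 128bit divide. */
static int div_matches_reference(uint64_t hi, uint64_t lo, uint64_t d)
{
	unsigned __int128 n = ((unsigned __int128)hi << 64) | lo;

	return div_u128_u64_bitwise(hi, lo, d) == (uint64_t)(n / d);
}
```

A loop like this always runs 64 iterations, so its cost is data independent but high, which is the kind of overhead a rewrite based on word-sized divide instructions would aim to avoid.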

	David

