linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Laight <David.Laight@ACULAB.COM>
To: 'Jules Bashizi Irenge' <jbi.octave@gmail.com>
Cc: "linux-perf-users@vger.kernel.org"
	<linux-perf-users@vger.kernel.org>,
	'Linus Torvalds' <torvalds@linux-foundation.org>
Subject: RE: [PATCH] perf/x86/amd/power: Use div64_u64 onstead of do_div()
Date: Thu, 2 May 2024 22:18:44 +0000	[thread overview]
Message-ID: <424d62cd084a4aaeb1067ee0af8e98ea@AcuMS.aculab.com> (raw)
In-Reply-To: <ZjPApuDc0gL9Pchm@octinomon.home>

> > > -	do_div(delta, tdelta);
> > > +	div64_u64(delta, tdelta);
> >
> > Nak - you've not tested it.
> >
> > 	David
> >
> No, I later realised it's for 32bit here. It may be very expensive for
> a 32bit.
> I think I got it wrong here.

You've missed the main issue.
do_div(a, b) sets 'a = a / b' and returns 'a % b'.
div64_u64(a, b) returns 'a / b'.
They are not direct substitutes.

For x86 (32 bit - does anyone care are more??) do_div is defined as:

/*
 * do_div() is NOT a C function. It wants to return
 * two values (the quotient and the remainder), but
 * since that doesn't work very well in C, what it
 * does is:
 *
 * - modifies the 64-bit dividend _in_place_
 * - returns the 32-bit remainder
 *
 * This ends up being the most efficient "calling
 * convention" on x86.
 */
#define do_div(n, base)						\
({								\
	unsigned long __upper, __low, __high, __mod, __base;	\
	__base = (base);					\
	if (__builtin_constant_p(__base) && is_power_of_2(__base)) { \
		__mod = n & (__base - 1);			\
		n >>= ilog2(__base);				\
	} else {						\
		asm("" : "=a" (__low), "=d" (__high) : "A" (n));\
		__upper = __high;				\
		if (__high) {					\
			__upper = __high % (__base);		\
			__high = __high / (__base);		\
		}						\
		asm("divl %2" : "=a" (__low), "=d" (__mod)	\
			: "rm" (__base), "0" (__low), "1" (__upper));	\
		asm("" : "=A" (n) : "a" (__low), "d" (__high));	\
	}							\
	__mod;							\
})

Somewhen it must have got changed from a simple asm wrapper for divl.
(And the test in the middle should be '__high > __base' to avoid
a division that is guaranteed to return zero.)

I'm also not at all sure what happens when divide gets speculatively
executed - there is a fair chance it has to complete.
So that extra check for returning 64bit quotients may be move
expensive that you might think.
(Although x86 will fault if the quotient would overflow - not ideal.)

I do wonder how much more it would cost to check the high 32bits
of a 64bit 'base' for being zero in a full 64x64 divide?
Any 'full software' 64bit divide function is going to do that anyway.

For 32bit it should be possible to do an inline 32x32 divide if
both 64bit values are only 32bit and an out of line call otherwise.
Although that might need to be built into gcc (at least for
architectures where the modulus isn't free) since gcc detects
when code does a/b and a%b and will only do one divide.

So maybe just supporting 64bit divide wouldn't be that expensive??

(Although there is a fubar on risc-v where even 'q = a / b'
'r = a - q * b' gets converted to 'r = a % b' and two slow
divides.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


  reply	other threads:[~2024-05-02 22:19 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-28 16:40 [PATCH] perf/x86/amd/power: Use div64_u64 onstead of do_div() Jules Irenge
2024-05-02 16:25 ` David Laight
2024-05-02 16:34   ` Jules Bashizi Irenge
2024-05-02 22:18     ` David Laight [this message]
2024-05-02 22:59       ` Linus Torvalds
2024-05-04 22:53         ` David Laight
2024-05-05 19:38           ` Linus Torvalds
2024-05-06 16:39             ` David Laight
2024-05-06 16:45               ` Linus Torvalds
2024-05-06 17:26                 ` Linus Torvalds
2024-05-06 20:25                   ` David Laight
2024-05-06 17:43                 ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=424d62cd084a4aaeb1067ee0af8e98ea@AcuMS.aculab.com \
    --to=david.laight@aculab.com \
    --cc=jbi.octave@gmail.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).