Re: [PATCH] tcp_cubic: use 32 bit math

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Stephen Hemminger <shemminger@linux-foundation.org>
To: Willy Tarreau <w@1wt.eu>
Cc: David Miller <davem@davemloft.net>,
	rkuhn@e18.physik.tu-muenchen.de, andi@firstfloor.org,
	dada1@cosmosbay.com, jengelh@linux01.gwdg.de,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] tcp_cubic: use 32 bit math
Date: Wed, 21 Mar 2007 12:58:54 -0700	[thread overview]
Message-ID: <20070321125854.465cf4f9@freekitty> (raw)
In-Reply-To: <20070321191537.GA28914@1wt.eu>

On Wed, 21 Mar 2007 20:15:37 +0100
Willy Tarreau <w@1wt.eu> wrote:

> Hi Stephen,
> 
> On Wed, Mar 21, 2007 at 11:54:19AM -0700, Stephen Hemminger wrote:
> > On Tue, 13 Mar 2007 21:50:20 +0100
> > Willy Tarreau <w@1wt.eu> wrote:
> 
> [...] ( cut my boring part )
> 
> > > Here are the results classed by speed :
> > > 
> > > /* Sample output on a Pentium-M 600 MHz :
> > > 
> > > Function          clocks mean(us)  max(us)  std(us) Avg err size
> > > ncubic_tab0           79     0.66     7.20     1.04  0.613%  160
> > > ncubic_0div           84     0.70     7.64     1.57  4.521%  192
> > > ncubic_1div          178     1.48    16.27     1.81  0.443%  336
> > > ncubic_tab1          179     1.49    16.34     1.85  0.195%  320
> > > ncubic_ndiv3         263     2.18    24.04     3.59  0.250%  512
> > > ncubic_2div          270     2.24    24.70     2.77  0.187%  512
> > > ncubic32_1           359     2.98    32.81     3.59  0.238%  544
> > > ncubic_3div          361     2.99    33.08     3.79  0.170%  656
> > > ncubic32             364     3.02    33.29     3.51  0.247%  544
> > > ncubic               529     4.39    48.39     4.92  0.247%  720
> > > hcbrt                539     4.47    49.25     5.98  1.580%   96
> > > ocubic               732     4.93    61.83     7.22  0.274%  320
> > > acbrt                842     6.98    76.73     8.55  0.275%  192
> > > bictcp              1032     6.95    86.30     9.04  0.172%  768
> > > 
> 
> [...]
> 
> > The following version of div64_64 is faster because do_div already
> > optimized for the 32 bit case..
>

/* 64bit divisor, dividend and result. dynamic precision */
static uint64_t div64_64(uint64_t dividend, uint64_t divisor)
{
	uint32_t high, d;

	high = divisor >> 32;
	if (high) {
		unsigned int shift = fls(high);

		d = divisor >> shift;
		dividend >>= shift;
	} else
		d = divisor;

	do_div(dividend, d);

	return dividend;
}



> Cool, this is interesting because I first wanted to optimize it but did
> not find how to start with this. You seem to get very good results. BTW,
> you did not append your changes.
> 
> However, one thing I do not understand is why your avg error is about 1/3
> below the original one. Was there a precision bug in the original div_64_64
> or did you extend the values used in the test ?
> 
> Or perhaps you used -fast-math to build and the original cbrt() is less
> precise in this case ?

No, but I did use -mtune=pentiumm on the ULV
-- 
Stephen Hemminger <shemminger@linux-foundation.org>

next prev parent reply	other threads:[~2007-03-21 20:03 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-02-24  1:05 [RFC] div64_64 support Stephen Hemminger
2007-02-24 16:19 ` Sami Farin
2007-02-26 19:28   ` Stephen Hemminger
2007-02-26 19:39     ` David Miller
2007-02-26 20:09 ` Jan Engelhardt
2007-02-26 21:28   ` Stephen Hemminger
2007-02-27  1:20     ` H. Peter Anvin
2007-02-27  3:45       ` Segher Boessenkool
2007-02-26 22:31   ` Stephen Hemminger
2007-02-26 23:02     ` Jan Engelhardt
2007-02-26 23:44       ` Stephen Hemminger
2007-02-27  0:05         ` Jan Engelhardt
2007-02-27  0:07           ` Stephen Hemminger
2007-02-27  0:14             ` Jan Engelhardt
2007-02-27  6:21     ` Dan Williams
2007-03-03  2:31     ` Andi Kleen
2007-03-05 23:57       ` Stephen Hemminger
2007-03-06  0:25         ` David Miller
2007-03-06 13:36           ` Andi Kleen
2007-03-06 14:04           ` [RFC] div64_64 support II Andi Kleen
2007-03-06 17:43             ` Dagfinn Ilmari Mannsåker
2007-03-06 18:25               ` David Miller
2007-03-06 18:48             ` H. Peter Anvin
2007-03-06 13:34         ` [RFC] div64_64 support Andi Kleen
2007-03-06 14:19           ` Eric Dumazet
2007-03-06 14:45             ` Andi Kleen
2007-03-06 15:10               ` Roland Kuhn
2007-03-06 18:29                 ` Stephen Hemminger
2007-03-06 19:48                   ` Andi Kleen
2007-03-06 20:04                     ` Stephen Hemminger
2007-03-06 21:53                   ` Sami Farin
2007-03-06 22:24                     ` Sami Farin
2007-03-07  0:00                       ` Stephen Hemminger
2007-03-07  0:05                         ` David Miller
2007-03-07  0:05                         ` Sami Farin
2007-03-07 16:11                       ` Chuck Ebbert
2007-03-07 18:32                         ` Sami Farin
2007-03-08 18:23                       ` asm volatile [Was: [RFC] div64_64 support] Sami Farin
2007-03-08 22:01                         ` asm volatile David Miller
2007-03-06 21:58                   ` [RFC] div64_64 support David Miller
2007-03-06 22:47                     ` [PATCH] tcp_cubic: faster cube root Stephen Hemminger
2007-03-06 22:58                       ` cube root benchmark code Stephen Hemminger
2007-03-07  6:08                         ` Update to " Willy Tarreau
2007-03-08  1:07                           ` [PATCH] tcp_cubic: use 32 bit math Stephen Hemminger
2007-03-08  2:55                             ` David Miller
2007-03-08  3:10                               ` Stephen Hemminger
2007-03-08  3:51                                 ` David Miller
2007-03-10 11:48                                   ` Willy Tarreau
2007-03-12 21:11                                     ` Stephen Hemminger
2007-03-13 20:50                                       ` Willy Tarreau
2007-03-21 18:54                                         ` Stephen Hemminger
2007-03-21 19:15                                           ` Willy Tarreau
2007-03-21 19:58                                             ` Stephen Hemminger [this message]
2007-03-21 20:15                                             ` [PATCH 1/2] div64_64 optimization Stephen Hemminger
2007-03-21 20:17                                               ` [PATCH 2/2] tcp: cubic optimization Stephen Hemminger
2007-03-22 19:11                                                 ` David Miller
2007-03-22 19:11                                               ` [PATCH 1/2] div64_64 optimization David Miller
2007-03-08  4:16                                 ` [PATCH] tcp_cubic: use 32 bit math Willy Tarreau
2007-03-07  4:20                       ` [PATCH] tcp_cubic: faster cube root David Miller
2007-03-07 12:12                         ` Andi Kleen
2007-03-07 19:33                           ` David Miller
2007-03-06 18:50               ` [RFC] div64_64 support H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070321125854.465cf4f9@freekitty \
    --to=shemminger@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=dada1@cosmosbay.com \
    --cc=davem@davemloft.net \
    --cc=jengelh@linux01.gwdg.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=rkuhn@e18.physik.tu-muenchen.de \
    --cc=w@1wt.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.