public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "George Spelvin" <linux@horizon.com>
To: akpm@linux-foundation.org, linux@horizon.com,
	peterz@infradead.org, zengzhaoxiu@163.com
Cc: linux-kernel@vger.kernel.org, zhaoxiu.zeng@gmail.com
Subject: Re: [patch V2] lib: GCD: add binary GCD algorithm
Date: 27 Apr 2016 12:20:33 -0400	[thread overview]
Message-ID: <20160427162033.11271.qmail@ns.horizon.com> (raw)
In-Reply-To: <1461744350-24156-1-git-send-email-zengzhaoxiu@163.com>

I replicated your results in 32- and 64-bit x86:

- If __ffs (gcc's __builtin_ctz) is available, the basic
  (non-even/odd) binary GCD algorithm is faster than the
  divsion-based or even/odd.
- If shifting down has to be done in a loop, the even/odd
  binary algorithm is fastest.

I tried a few variants, but couldn't consistently beat your code.
The following variant, with the loop test at the end, seemed to do better
in some cases, but it's not consistent enought to be sure:

static unsigned long gcd5(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;

	if (!a || !b)
		return r;

	b >>= __builtin_ctzl(b);

	do {
		a >>= __builtin_ctzl(a);
		if (a < b)
			swap(a, b);
		a -= b;
	} while (a);
	return b << __builtin_ctzl(r);
}


The frequent "if (USE_FFS)" conditions seem to clutter the code.
Would it be easier to read if the two variants were just separated?
The function is short enough that the duplication isn't too severe.

Such as the following.

A few source cleanups (included for example; delete if you don't
like them):
- Return directly from the loop, rather than using break().
- Use "r &= -r" mostly because it's clearer.
Signed-off-by: George Spelvin <linux@horizon.com>

/*
 * This implements the binary GCD algorithm. (Often attributed to Stein,
 * but as Knuth has noted, appears a first-century Chinese math text.)
 *
 * This is faster than the division-based algorithm even on x86, which
 * has decent hardware division.
 */
#if USE_FFS
/* If __ffs is available, the even/odd algorithm benchmarks slower. */
unsigned long gcd(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;

	if (!a || !b)
		return r;

	b >>= __ffs(b);

	for (;;) {
		a >>= __ffs(a);
		if (a == b)
			return a << __ffs(r);
		if (a < b)
			swap(a, b);
		a -= b;
	}
}

#else /* !USE_FFS */

/* If normalization is done by loops, the even/odd algorithm is a win. */
unsigned long gcd(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;

	if (!a || !b)
		return r;

	r &= -r;	/* Isolate lsbit of r */

	while (!(b & r))
		b >>= 1;

	for (;;) {
		while (!(a & r))
			a >>= 1;
		if (a == b)
			return a;
		if (a < b)
			swap(a, b);
		a -= b;
		a >>= 1;
		if (a & r)
			a += b;
		a >>= 1;
	}
}
#endif

  reply	other threads:[~2016-04-27 16:20 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-27  8:05 [patch V2] lib: GCD: add binary GCD algorithm zengzhaoxiu
2016-04-27 16:20 ` George Spelvin [this message]
2016-04-28  7:21   ` Peter Zijlstra
2016-04-27 20:08 ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160427162033.11271.qmail@ns.horizon.com \
    --to=linux@horizon.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=zengzhaoxiu@163.com \
    --cc=zhaoxiu.zeng@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox