From: Ingo Molnar <mingo@elte.hu>
To: Alexander van Heukelum <heukelum@mailshack.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	LKML <linux-kernel@vger.kernel.org>,
	heukelum@fastmail.fm
Subject: Re: [PATCH] x86: Change x86 to use generic find_next_bit
Date: Sun, 9 Mar 2008 21:10:16 +0100
Message-ID: <20080309201016.GA28454@elte.hu>
In-Reply-To: <20080309200103.GA895@mailshack.com>


* Alexander van Heukelum <heukelum@mailshack.com> wrote:

> x86: Change x86 to use the generic find_next_bit implementation
> 
> The versions with inline assembly are in fact slower on the machines I 
> tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, 
> AMD Opteron 270). The i386-version needed a fix similar to 06024f21 to 
> avoid crashing the benchmark.
> 
> Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size 
> 1...512, for each possible bitmap with one bit set, for each possible 
> offset: find the position of the first bit starting at offset. If you 
> follow ;). Times include setup of the bitmap and checking of the 
> results.
> 
> 		Athlon		Xeon		Opteron 32/64bit
> x86-specific:	0m3.692s	0m2.820s	0m3.196s / 0m2.480s
> generic:	0m2.622s	0m1.662s	0m2.100s / 0m1.572s

ok, that's rather convincing.

the generic version in lib/find_next_bit.c is open-coded C which gcc can 
optimize pretty nicely.

the hand-coded assembly versions in arch/x86/lib/bitops_32.c mostly use 
the special x86 'bit search forward' (BSF) instruction - which, as i 
know from the days when the scheduler relied on it, has some non-trivial 
setup costs. So especially when there are _small_ bitmasks involved, 
it's more expensive.
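
For illustration, one way to write find_next_bit in open-coded C is 
sketched below. It is a minimal sketch and not the code in 
lib/find_next_bit.c: the name sketch_find_next_bit, the 
SKETCH_BITS_PER_LONG macro and the use of the GCC/Clang builtin 
__builtin_ctzl for the final bit lookup are all assumptions made here.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical sketch: return the index of the first set bit at or
 * after 'offset', or exactly 'size' if there is none.
 */
unsigned long sketch_find_next_bit(const unsigned long *addr,
                                   unsigned long size,
                                   unsigned long offset)
{
    if (offset >= size)
        return size;

    unsigned long idx = offset / SKETCH_BITS_PER_LONG;
    /* Ignore the bits below 'offset' in the first word. */
    unsigned long word = addr[idx] &
                         (~0UL << (offset % SKETCH_BITS_PER_LONG));

    for (;;) {
        if (word) {
            unsigned long bit = idx * SKETCH_BITS_PER_LONG +
                                (unsigned long)__builtin_ctzl(word);
            /* Stray bits past 'size' in the last word count as a miss. */
            return bit < size ? bit : size;
        }
        idx++;
        if (idx * SKETCH_BITS_PER_LONG >= size)
            return size;
        word = addr[idx];
    }
}
```

Everything here uses unsigned long, and every miss path returns exactly 
size, including the case where size is not a multiple of the word width.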

> If the bitmap size is not a multiple of BITS_PER_LONG, and no set 
> (cleared) bit is found, find_next_bit (find_next_zero_bit) returns a 
> value outside of the range [0,size]. The generic version always 
> returns exactly size. The generic version also uses unsigned long 
> everywhere, while the x86 versions use a mishmash of int, unsigned 
> (int), long and unsigned long.

i'm not surprised that the hand-coded assembly versions had a bug ...

[ this means we have to test it quite carefully though, as lots of code 
  only ever gets tested on x86, so code could have built up a dependency 
  on the buggy behavior. ]
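
A userspace check along the lines of the quoted benchmark (every bitmap 
size, every single-bit bitmap, every offset) could look roughly like the 
sketch below. All names are hypothetical. sloppy_find_next_bit is a 
deliberately wrong stand-in that rounds its "not found" result up to a 
word boundary. It is not the actual x86 code and only illustrates the 
kind of out-of-range return value described in the quoted text.

```c
#include <limits.h>
#include <string.h>

#define CHK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define CHK_MAX_BITS 512UL
#define CHK_WORDS ((CHK_MAX_BITS + CHK_BITS_PER_LONG - 1) / CHK_BITS_PER_LONG)

typedef unsigned long (*find_fn)(const unsigned long *addr,
                                 unsigned long size, unsigned long offset);

/* Bit-by-bit reference: slow, but returns exactly 'size' on a miss. */
unsigned long ref_find_next_bit(const unsigned long *addr,
                                unsigned long size, unsigned long offset)
{
    for (unsigned long i = offset; i < size; i++)
        if (addr[i / CHK_BITS_PER_LONG] & (1UL << (i % CHK_BITS_PER_LONG)))
            return i;
    return size;
}

/* Illustrative bug only: a miss is rounded up to the next word boundary. */
unsigned long sloppy_find_next_bit(const unsigned long *addr,
                                   unsigned long size, unsigned long offset)
{
    unsigned long r = ref_find_next_bit(addr, size, offset);

    if (r == size)
        return (size + CHK_BITS_PER_LONG - 1) / CHK_BITS_PER_LONG
               * CHK_BITS_PER_LONG;
    return r;
}

/* Count the cases where 'fn' disagrees with the expected result. */
unsigned long check_find_next_bit(find_fn fn, unsigned long max_size)
{
    unsigned long map[CHK_WORDS];
    unsigned long failures = 0;

    if (max_size > CHK_MAX_BITS)
        max_size = CHK_MAX_BITS;

    for (unsigned long size = 1; size <= max_size; size++) {
        for (unsigned long bit = 0; bit < size; bit++) {
            memset(map, 0, sizeof(map));
            map[bit / CHK_BITS_PER_LONG] |= 1UL << (bit % CHK_BITS_PER_LONG);
            for (unsigned long off = 0; off <= size; off++) {
                /* The only set bit is 'bit'; past it, expect 'size'. */
                unsigned long want = off <= bit ? bit : size;

                if (fn(map, size, off) != want)
                    failures++;
            }
        }
    }
    return failures;
}
```

A caller that tests the result with "== size" rather than ">= size" is 
exactly the kind of code that behaves differently between the two 
variants, so both comparison styles are worth grepping for when auditing 
users.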

> Using the generic version does give a slightly bigger kernel, though.
> 
> defconfig:	   text    data     bss     dec     hex filename
> x86-specific:	4738555  481232  626688 5846475  5935cb vmlinux (32 bit)
> generic:	4738621  481232  626688 5846541  59360d vmlinux (32 bit)
> x86-specific:	5392395  846568  724424 6963387  6a40bb vmlinux (64 bit)
> generic:	5392458  846568  724424 6963450  6a40fa vmlinux (64 bit)

i'd not worry about that too much. Have you tried to build with:

  CONFIG_CC_OPTIMIZE_FOR_SIZE=y
  CONFIG_OPTIMIZE_INLINING=y

(the latter only available in x86.git)

> Patch is against -x86#testing. It compiles.

i've picked it up into x86.git, let's see how it goes in practice.

	Ingo
