From: "Alexander van Heukelum" <heukelum@fastmail.fm>
To: "Ingo Molnar" <mingo@elte.hu>,
	"Alexander van Heukelum" <heukelum@mailshack.com>
Cc: "Thomas Gleixner" <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"LKML" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86: Change x86 to use generic find_next_bit
Date: Sun, 09 Mar 2008 22:13:15 +0100
Message-ID: <1205097195.13205.1241421773@webmail.messagingengine.com>
In-Reply-To: <20080309201016.GA28454@elte.hu>

On Sun, 9 Mar 2008 21:10:16 +0100, "Ingo Molnar" <mingo@elte.hu> said:
> > 		Athlon		Xeon		Opteron 32/64bit
> > x86-specific:	0m3.692s	0m2.820s	0m3.196s / 0m2.480s
> > generic:	0m2.622s	0m1.662s	0m2.100s / 0m1.572s
> 
> ok, that's rather convincing.
> 
> the generic version in lib/find_next_bit.c is open-coded C which gcc can 
> optimize pretty nicely.
> 
> the hand-coded assembly versions in arch/x86/lib/bitops_32.c mostly use 
> the special x86 'bit search forward' (BSF) instruction - which i know 
> from the days when the scheduler relied on it has some non-trivial setup 
> costs. So especially when there's _small_ bitmasks involved, it's more 
> expensive.

Hi,

BSF is fine, it doesn't need any special setup. The problem is probably
that the old versions use find_first_bit and find_first_zero_bit,
which are also hand optimized versions... and they use "repe scasl/q".
That's another little project ;).

> > If the bitmap size is not a multiple of BITS_PER_LONG, and no set 
> > (cleared) bit is found, find_next_bit (find_next_zero_bit) returns a 
> > value outside of the range [0,size]. The generic version always 
> > returns exactly size. The generic version also uses unsigned long 
> > everywhere, while the x86 versions use a mishmash of int, unsigned 
> > (int), long and unsigned long.
> 
> i'm not surprised that the hand-coded assembly versions had a bug ...

Not surprised about the bug, but it was in fact noticed, and fixed
in x86_64!
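
To make the return-value point concrete, here is a minimal sketch of a
word-at-a-time search that always returns exactly size when nothing is
found. It is not the code from lib/find_next_bit.c; the names
sketch_find_next_bit, sketch_ffs and SKETCH_BITS_PER_LONG are made up
for illustration, and the lowest-set-bit helper is a plain loop rather
than BSF or a compiler builtin.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Index of the lowest set bit; the caller guarantees word != 0. */
static unsigned long sketch_ffs(unsigned long word)
{
	unsigned long n = 0;

	while (!(word & 1UL)) {
		word >>= 1;
		n++;
	}
	return n;
}

/*
 * Return the index of the first set bit at or after offset, or exactly
 * size if there is none. Everything is unsigned long, so there is no
 * int/long mixing.
 */
unsigned long sketch_find_next_bit(const unsigned long *addr,
				   unsigned long size, unsigned long offset)
{
	unsigned long idx, word, bit;

	if (offset >= size)
		return size;

	idx = offset / SKETCH_BITS_PER_LONG;
	/* Ignore the bits below offset in the first word. */
	word = addr[idx] & (~0UL << (offset % SKETCH_BITS_PER_LONG));

	for (;;) {
		if (word) {
			bit = idx * SKETCH_BITS_PER_LONG + sketch_ffs(word);
			/* Clamp: stray bits past size in the last word do not count. */
			return bit < size ? bit : size;
		}
		idx++;
		if (idx * SKETCH_BITS_PER_LONG >= size)
			return size;
		word = addr[idx];
	}
}
```

The clamp in the found-a-bit branch is what keeps the result inside
[0,size] when size is not a multiple of the word size.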

> [ this means we have to test it quite carefully though, as lots of code 
>   only ever gets tested on x86 so code could have built dependency on 
>   the buggy behavior. ]

Agreed.

> > Using the generic version does give a slightly bigger kernel, though.
> > 
> > defconfig:	   text    data     bss     dec     hex filename
> > x86-specific:	4738555  481232  626688 5846475  5935cb vmlinux (32 bit)
> > generic:	4738621  481232  626688 5846541  59360d vmlinux (32 bit)
> > x86-specific:	5392395  846568  724424 6963387  6a40bb vmlinux (64 bit)
> > generic:	5392458  846568  724424 6963450  6a40fa vmlinux (64 bit)
> 
> i'd not worry about that too much. Have you tried to build with:

I don't but I needed to compile something to test the build anyhow ;)

>   CONFIG_CC_OPTIMIZE_FOR_SIZE=y
>   CONFIG_OPTIMIZE_INLINING=y

This was defconfig in -x86#testing, they were both already enabled. 
Here is what you get with those options turned off ;).

                   text    data     bss     dec     hex filename
x86-specific:   5543996  481232  626688 6651916  65800c vmlinux (32 bit)
generic:        5543880  481232  626688 6651800  657f98 vmlinux (32 bit)
x86-specific:   6111834  846568  724424 7682826  753b0a vmlinux (64 bit)
generic:        6111882  846568  724424 7682874  753b3a vmlinux (64 bit)

(and I double-checked the i386 results)

> (the latter only available in x86.git)
> 
> > Patch is against -x86#testing. It compiles.
> 
> i've picked it up into x86.git, lets see how it goes in practice.

Thanks,
    Alexander

> 	Ingo
-- 
  Alexander van Heukelum
  heukelum@fastmail.fm


