From: "Alexander van Heukelum" <heukelum@fastmail.fm>
To: "Ingo Molnar" <mingo@elte.hu>,
"Alexander van Heukelum" <heukelum@mailshack.com>
Cc: "Thomas Gleixner" <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
"LKML" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86: Change x86 to use generic find_next_bit
Date: Sun, 09 Mar 2008 22:13:15 +0100
Message-ID: <1205097195.13205.1241421773@webmail.messagingengine.com>
In-Reply-To: <20080309201016.GA28454@elte.hu>
On Sun, 9 Mar 2008 21:10:16 +0100, "Ingo Molnar" <mingo@elte.hu> said:
> >                 Athlon     Xeon       Opteron 32/64bit
> > x86-specific:   0m3.692s   0m2.820s   0m3.196s / 0m2.480s
> > generic:        0m2.622s   0m1.662s   0m2.100s / 0m1.572s
>
> ok, that's rather convincing.
>
> the generic version in lib/find_next_bit.c is open-coded C which gcc can
> optimize pretty nicely.
>
> the hand-coded assembly versions in arch/x86/lib/bitops_32.c mostly use
> the special x86 'bit search forward' (BSF) instruction - which i know
> from the days when the scheduler relied on it has some non-trivial setup
> costs. So especially when there's _small_ bitmasks involved, it's more
> expensive.
Hi,
BSF is fine; it doesn't need any special setup. The problem is probably
that the old versions call find_first_bit and find_first_zero_bit,
which are also hand-optimized versions... and they use "repe scasl/q".
That's another little project ;).
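
To illustrate the point about BSF versus "repe scasl/q", here is a minimal
userspace sketch of a find_first_bit written in plain C. It is not the
kernel's code: the name sketch_find_first_bit and the local BITS_PER_LONG
macro are made up for this example, and it assumes a GCC-compatible compiler
that provides __builtin_ctzl. On x86, such compilers typically turn that
builtin into a single BSF (or TZCNT) instruction, and the word scan is an
ordinary loop rather than a string instruction.

```c
#include <limits.h>

/* Local stand-in for the kernel macro of the same name (assumption). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical sketch: return the index of the first set bit in the
 * bitmap at addr, or size if no bit below size is set.
 */
static unsigned long sketch_find_first_bit(const unsigned long *addr,
                                           unsigned long size)
{
    unsigned long idx;

    /* Plain C loop over the words that overlap [0, size). */
    for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
        if (addr[idx]) {
            /* Count trailing zeros to locate the lowest set bit. */
            unsigned long bit = idx * BITS_PER_LONG +
                                (unsigned long)__builtin_ctzl(addr[idx]);

            /* Ignore a bit that lies past the end of the bitmap. */
            return bit < size ? bit : size;
        }
    }
    return size;
}
```

The bit search runs once, on the first non-zero word, so for small bitmaps
the cost is dominated by how the words are scanned, not by the bit search.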
> > If the bitmap size is not a multiple of BITS_PER_LONG, and no set
> > (cleared) bit is found, find_next_bit (find_next_zero_bit) returns a
> > value outside of the range [0,size]. The generic version always
> > returns exactly size. The generic version also uses unsigned long
> > everywhere, while the x86 versions use a mishmash of int, unsigned
> > (int), long and unsigned long.
>
> i'm not surprised that the hand-coded assembly versions had a bug ...
Not surprised about the bug, but it was in fact noticed, and fixed
in x86_64!
> [ this means we have to test it quite carefully though, as lots of code
> only ever gets tested on x86 so code could have built dependency on
> the buggy behavior. ]
Agreed.
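
The return-value difference described above can be sketched in userspace C.
This is an illustration under stated assumptions, not the code in
lib/find_next_bit.c: sketch_find_next_bit and the local BITS_PER_LONG macro
are hypothetical names, and __builtin_ctzl assumes a GCC-compatible compiler.
The point is the final clamp: when the bitmap size is not a multiple of
BITS_PER_LONG, a stray bit in the last word must not leak out as a result
larger than size.

```c
#include <limits.h>

/* Local stand-in for the kernel macro of the same name (assumption). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical sketch: return the index of the first set bit at or
 * after offset, or exactly size if there is none below size.
 * Uses unsigned long throughout, as the generic version does.
 */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long offset)
{
    unsigned long idx, word;

    if (offset >= size)
        return size;

    idx = offset / BITS_PER_LONG;
    /* Mask off the bits below offset in the first word. */
    word = addr[idx] & (~0UL << (offset % BITS_PER_LONG));

    while (word == 0) {
        idx++;
        if (idx * BITS_PER_LONG >= size)
            return size;
        word = addr[idx];
    }

    offset = idx * BITS_PER_LONG + (unsigned long)__builtin_ctzl(word);

    /* Clamp: a bit found past the end counts as "not found". */
    return offset < size ? offset : size;
}
```

Callers can then test for "not found" with a single comparison against
size, regardless of what the unused bits in the last word contain.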
> > Using the generic version does give a slightly bigger kernel, though.
> >
> > defconfig:      text     data    bss     dec      hex     filename
> > x86-specific:   4738555  481232  626688  5846475  5935cb  vmlinux (32 bit)
> > generic:        4738621  481232  626688  5846541  59360d  vmlinux (32 bit)
> > x86-specific:   5392395  846568  724424  6963387  6a40bb  vmlinux (64 bit)
> > generic:        5392458  846568  724424  6963450  6a40fa  vmlinux (64 bit)
>
> i'd not worry about that too much. Have you tried to build with:
I don't, but I needed to compile something to test the build anyhow ;)
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y
> CONFIG_OPTIMIZE_INLINING=y
This was defconfig in -x86#testing, they were both already enabled.
Here is what you get with those options turned off ;).
                text     data    bss     dec      hex     filename
x86-specific:   5543996  481232  626688  6651916  65800c  vmlinux (32 bit)
generic:        5543880  481232  626688  6651800  657f98  vmlinux (32 bit)
x86-specific:   6111834  846568  724424  7682826  753b0a  vmlinux (64 bit)
generic:        6111882  846568  724424  7682874  753b3a  vmlinux (64 bit)
(and I double-checked the i386 results)
> (the latter only available in x86.git)
>
> > Patch is against -x86#testing. It compiles.
>
> i've picked it up into x86.git, lets see how it goes in practice.
Thanks,
Alexander
> Ingo
--
Alexander van Heukelum
heukelum@fastmail.fm
Thread overview: 19+ messages
2008-03-09 20:01 [PATCH] x86: Change x86 to use generic find_next_bit Alexander van Heukelum
2008-03-09 20:10 ` Ingo Molnar
2008-03-09 21:03 ` Andi Kleen
2008-03-09 21:32 ` Andi Kleen
2008-03-09 21:13 ` Alexander van Heukelum [this message]
2008-03-10 6:29 ` Ingo Molnar
2008-03-09 20:11 ` Ingo Molnar
2008-03-09 20:31 ` Alexander van Heukelum
2008-03-09 20:51 ` Ingo Molnar
2008-03-09 21:29 ` Andi Kleen
2008-03-10 23:17 ` [RFC/PATCH] x86: Optimize find_next_(zero_)bit for small constant-size bitmaps Alexander van Heukelum
2008-03-11 9:56 ` Ingo Molnar
2008-03-11 15:17 ` [PATCH] " Alexander van Heukelum
2008-03-11 15:22 ` [RFC] non-x86: " Alexander van Heukelum
2008-03-11 15:23 ` [PATCH] x86: " Ingo Molnar
2008-03-09 20:28 ` [PATCH] x86: Change x86 to use generic find_next_bit Andi Kleen
2008-03-09 21:31 ` Andi Kleen
2008-03-13 12:44 ` Aneesh Kumar K.V
2008-03-13 14:27 ` Alexander van Heukelum