From: Andi Kleen <andi@firstfloor.org>
To: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: Ingo Molnar <mingo@elte.hu>, linux-kernel@vger.kernel.org
Subject: Re: [v2.6.26] what's brewing in x86.git for v2.6.26
Date: Thu, 17 Apr 2008 12:51:09 +0200
Message-ID: <48072B9D.2000900@firstfloor.org>
In-Reply-To: <1208426793.10305.1248377703@webmail.messagingengine.com>
>
> The input for the first 'benchmark' was indeed completely unrealistic.
> They did show a very convincing speedup, though. This program was
> really written to verify the implementation and was later converted
> to a benchmark. Many benchmarks are unrealistic. I also wrote a
> benchmark for find_first_bit and find_next_bit:
> http://heukelum.fastmail.fm/find_first_bit

I think a realistic benchmark would run a real kernel, profile the
input values of the bitmap functions, and then test those cases.

I actually started on that when I complained last time, by writing a
systemtap script that generates a histogram. But for some reason
systemtap couldn't tap all the bitmap functions in my kernel and missed
some completely, and I ran out of time tracking that down.

My gut feeling is that the only interesting cases are cpumask/nodemask
sized bitmaps (which can be one word or two words, but now up to 64 words
on an NR_CPUS=4096 x86 kernel) and 4k sized ext3/reiser/etc. block bitmaps.

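To make the histogram idea concrete, here is a minimal userspace sketch in C of the bookkeeping such a profile would need. It is not the systemtap script mentioned above; `record_call`, `bucket_for`, and `hist` are hypothetical names, and the power-of-two bucketing by word count is an assumption chosen so that one-word cpumasks and multi-hundred-word block bitmaps land in clearly separate bins.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define NBUCKETS 16

/* hist[b] counts calls whose bitmap needed at most 2^b words
 * (and more than 2^(b-1) words for b > 0). Hypothetical layout. */
static unsigned long hist[NBUCKETS];

/* Number of unsigned longs needed to hold `bits` bits. */
static size_t bits_to_words(size_t bits)
{
    return (bits + BITS_PER_LONG - 1) / BITS_PER_LONG;
}

/* Smallest b with 2^b >= words, capped at the last bucket. */
static unsigned bucket_for(size_t words)
{
    unsigned b = 0;

    while (((size_t)1 << b) < words && b < NBUCKETS - 1)
        b++;
    return b;
}

/* Would be called once per observed find_*_bit() invocation. */
static void record_call(size_t size_bits)
{
    hist[bucket_for(bits_to_words(size_bits))]++;
}

/* Print one line per non-empty bucket. */
static void print_hist(void)
{
    unsigned b;

    for (b = 0; b < NBUCKETS; b++)
        if (hist[b])
            printf("<= %lu words: %lu calls\n", 1UL << b, hist[b]);
}
```

Feeding `record_call()` the size argument of each traced call and then calling `print_hist()` would show whether cpumask-sized or block-bitmap-sized inputs dominate, which is the question the benchmark inputs should be derived from.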
> My conclusion would be: the speed of the generic bitmap implementation
> is either better than or at least comparable to the current private
> implementations in i386/x86_64.

Ok.

> The generic version is out-of-line,
> while the private implementation of i386 was inlined: this causes a
> regression for very small bitmaps. However, if the bitmap size is
> a constant and fits a long integer, the updated generic code should
> inline an optimized version, like x86_64 currently does it.

Yes, it probably should. cpumask walks are relatively common.

I remember profiling MySQL some time ago; it did bad overscheduling
due to dumb locking. The funny thing was that the mask walking in the
scheduler actually stood out. No, I don't claim extreme overscheduling
is an interesting case to optimize for, but there are more realistic
workloads that also do a lot of context switching.

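The constant-size fast path discussed in the quoted text can be sketched as follows. This is an illustration under assumptions, not the patch under discussion: `sketch_find_first_bit` and `generic_find_first_bit` are hypothetical names, and `__builtin_constant_p` and `__builtin_ctzl` are GCC/Clang builtins. When the size is a compile-time constant that fits in one long and the function is inlined with optimization enabled, the compiler can reduce the call to a mask and a bit scan; otherwise the generic loop is used.

```c
#include <limits.h>

#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Generic word-at-a-time scan; returns `size` if no bit is set.
 * In a kernel this would be the out-of-line version. */
static unsigned long generic_find_first_bit(const unsigned long *addr,
                                            unsigned long size)
{
    unsigned long idx;
    unsigned long nwords = (size + BITS_PER_LONG - 1) / BITS_PER_LONG;

    for (idx = 0; idx < nwords; idx++) {
        if (addr[idx]) {
            unsigned long bit = idx * BITS_PER_LONG +
                                (unsigned long)__builtin_ctzl(addr[idx]);
            /* Ignore set bits beyond the logical end of the bitmap. */
            return bit < size ? bit : size;
        }
    }
    return size;
}

/* Inline wrapper: single-word fast path for small constant sizes. */
static inline unsigned long sketch_find_first_bit(const unsigned long *addr,
                                                  unsigned long size)
{
    if (__builtin_constant_p(size) && size > 0 && size <= BITS_PER_LONG) {
        unsigned long mask = size == BITS_PER_LONG ? ~0UL
                                                   : (1UL << size) - 1;
        unsigned long word = addr[0] & mask;

        return word ? (unsigned long)__builtin_ctzl(word) : size;
    }
    return generic_find_first_bit(addr, size);
}
```

Both paths return the same values, so the wrapper only changes code generation: a one-word cpumask walk avoids the function call, while larger or variable-sized bitmaps still go through the generic loop.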
BTW, if you do generic work on this: one reason the generated code for
for_each_cpu etc. is so ugly is that the code has checks for
find_next_bit returning >= max size. If you can generalize the
code enough to make sure no arch does that anymore, these checks
could be eliminated.

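The extra check can be illustrated with a small userspace sketch. It is not the kernel's code: all names are hypothetical, and `loose_find_next_bit` merely models an arch implementation that scans whole words and can therefore return a value larger than the bitmap size. A caller that has to tolerate such an implementation needs a clamp on every step, whereas a strict implementation lets the caller use the result directly.

```c
#include <limits.h>

#define BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/* Test one bit in a bitmap stored as an array of unsigned long. */
static int bit_is_set(const unsigned long *addr, unsigned long bit)
{
    return (addr[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1UL;
}

/* Strict contract: returns exactly `size` when no further bit is set. */
static unsigned long strict_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long offset)
{
    unsigned long bit;

    for (bit = offset; bit < size; bit++)
        if (bit_is_set(addr, bit))
            return bit;
    return size;
}

/* Loose contract (hypothetical arch version): scans up to the end of the
 * last word, so the "not found" value may be larger than `size`. */
static unsigned long loose_find_next_bit(const unsigned long *addr,
                                         unsigned long size,
                                         unsigned long offset)
{
    unsigned long limit = (size + BITS_PER_LONG - 1) / BITS_PER_LONG *
                          BITS_PER_LONG;
    unsigned long bit;

    for (bit = offset; bit < limit; bit++)
        if (bit_is_set(addr, bit))
            return bit;
    return limit;
}

/* Next set bit after `n`; must clamp because of the loose contract. */
static unsigned long next_cpu_clamped(const unsigned long *mask,
                                      unsigned long size, unsigned long n)
{
    unsigned long r = loose_find_next_bit(mask, size, n + 1);

    return r < size ? r : size;   /* the check that bloats every walk */
}

/* Next set bit after `n`; no clamp needed with the strict contract. */
static unsigned long next_cpu_strict(const unsigned long *mask,
                                     unsigned long size, unsigned long n)
{
    return strict_find_next_bit(mask, size, n + 1);
}
```

Both helpers return the same values, but the clamped one carries a compare-and-select on every iteration. If every implementation honored the strict contract, callers could compare the result against the mask size directly and the clamp would disappear.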
-Andi