From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yury Norov <yury.norov@gmail.com>,
Dennis Zhou <dennis@kernel.org>,
Guenter Roeck <linux@roeck-us.net>,
Catalin Marinas <catalin.marinas@arm.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Geert Uytterhoeven <geert@linux-m68k.org>,
linux-m68k@lists.linux-m68k.org
Subject: Re: Linux 5.19-rc8
Date: Tue, 26 Jul 2022 20:44:34 +0100
Message-ID: <YuBEIiLL1xZVyEFl@shell.armlinux.org.uk>
In-Reply-To: <CAHk-=wg2-j8zocUjurAeg_bimNz7C5h5HDEXKK6PxDmR+DaHRg@mail.gmail.com>
On Tue, Jul 26, 2022 at 11:36:21AM -0700, Linus Torvalds wrote:
> On Tue, Jul 26, 2022 at 11:18 AM Yury Norov <yury.norov@gmail.com> wrote:
> >
> > We have find_bit_benchmark to check how it works in practice. Would
> > be great if someone with access to the hardware can share numbers.
>
> Honestly, I doubt benchmarking find_bit in a loop is all that sensible.
Yes, that's what I was thinking - I've never seen it crop up in any of
the perf traces I've seen.
Nevertheless, here are some numbers from a single run of the
find_bit_benchmark module, with the kernel built using:
arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110
Current native implementation:
[ 46.184565]
Start testing find_bit() with random-filled bitmap
[ 46.195127] find_next_bit: 2440833 ns, 163112 iterations
[ 46.204226] find_next_zero_bit: 2372128 ns, 164569 iterations
[ 46.213152] find_last_bit: 2199779 ns, 163112 iterations
[ 46.299398] find_first_bit: 79526013 ns, 16234 iterations
[ 46.684026] find_first_and_bit: 377912990 ns, 32617 iterations
[ 46.692020] find_next_and_bit: 1269071 ns, 73562 iterations
[ 46.698745]
Start testing find_bit() with sparse bitmap
[ 46.705711] find_next_bit: 118652 ns, 656 iterations
[ 46.716621] find_next_zero_bit: 4183472 ns, 327025 iterations
[ 46.723395] find_last_bit: 50448 ns, 656 iterations
[ 46.762308] find_first_bit: 32190802 ns, 656 iterations
[ 46.769093] find_first_and_bit: 52129 ns, 1 iterations
[ 46.775882] find_next_and_bit: 62522 ns, 1 iterations
Generic implementation:
[ 25.149238]
Start testing find_bit() with random-filled bitmap
[ 25.160002] find_next_bit: 2640943 ns, 163537 iterations
[ 25.169567] find_next_zero_bit: 2838485 ns, 164144 iterations
[ 25.178595] find_last_bit: 2302372 ns, 163538 iterations
[ 25.204016] find_first_bit: 18697630 ns, 16373 iterations
[ 25.602571] find_first_and_bit: 391841480 ns, 32555 iterations
[ 25.610563] find_next_and_bit: 1260306 ns, 73587 iterations
[ 25.617295]
Start testing find_bit() with sparse bitmap
[ 25.624222] find_next_bit: 70289 ns, 656 iterations
[ 25.636478] find_next_zero_bit: 5527050 ns, 327025 iterations
[ 25.643253] find_last_bit: 52147 ns, 656 iterations
[ 25.657304] find_first_bit: 7328573 ns, 656 iterations
[ 25.664087] find_first_and_bit: 48518 ns, 1 iterations
[ 25.670871] find_next_and_bit: 59750 ns, 1 iterations
Overall, I would say it's pretty similar (some generic routines perform
marginally better, some native ones perform marginally better) with the
exception of find_first_bit() being much better with the generic
implementation, but find_next_zero_bit() being noticeably worse.
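
To put the find_first_bit() gap into per-call terms, the totals can be
divided by the iteration counts. The following is only a small sketch of
that arithmetic; the struct and function names are made up for
illustration and are not part of find_bit_benchmark. The figures are
copied from the random-filled find_first_bit lines above.

```c
#include <assert.h>
#include <stdint.h>

/* One line of benchmark output: total time and iteration count.
 * (Hypothetical type, not taken from the kernel sources.) */
struct bench_result {
	uint64_t ns;
	uint64_t iterations;
};

/* Average cost of one iteration in nanoseconds (integer division). */
static uint64_t ns_per_iter(struct bench_result r)
{
	assert(r.iterations > 0);
	return r.ns / r.iterations;
}

/* Random-filled bitmap, find_first_bit, from the two runs above. */
static const struct bench_result native_first_bit  = { 79526013, 16234 };
static const struct bench_result generic_first_bit = { 18697630, 16373 };
```

ns_per_iter(native_first_bit) comes to 4898 ns and
ns_per_iter(generic_first_bit) to 1141 ns, so roughly a factor of four
in favour of the generic code for that one routine, from a single run.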
So, pretty much nothing of any relevance between them, which may
come as a surprise given the byte vs word access differences between
the two implementations.
I suspect the reason may be that the native implementation code is
smaller than the generic implementation, outweighing the effects of
by-byte rather than by-word access. I would also suspect that, because
of the smaller implementation, the native version performs better in
an I$-cool situation than the generic. Lastly, I would suspect that if
we fixed the bug in the native version, and converted it to use word
loads, it would probably be better than the generic version. I haven't
anything to base that on other than gut feeling at the moment, but I
can make the changes to the native implementation and see what effect
that has, possibly tomorrow.
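
For readers unfamiliar with the by-byte versus by-word distinction, here
is a rough userspace sketch of the two scanning strategies. It is not
the ARM assembly nor the generic lib/find_bit.c code; the function names
are invented, it assumes a little-endian machine so that byte i holds
bits 8*i to 8*i+7, and it relies on the GCC/Clang builtins
__builtin_ctz() and __builtin_ctzl().

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Byte-at-a-time scan: one load and test per 8 bits.
 * Assumes little-endian byte order within each long. */
static size_t first_bit_by_byte(const unsigned long *map, size_t nbits)
{
	const unsigned char *p = (const unsigned char *)map;
	size_t i;

	for (i = 0; i * CHAR_BIT < nbits; i++) {
		if (p[i]) {
			size_t bit = i * CHAR_BIT + (size_t)__builtin_ctz(p[i]);
			return bit < nbits ? bit : nbits;
		}
	}
	return nbits;	/* no bit set */
}

/* Word-at-a-time scan: one load and test per BITS_PER_LONG bits. */
static size_t first_bit_by_word(const unsigned long *map, size_t nbits)
{
	size_t i;

	for (i = 0; i * BITS_PER_LONG < nbits; i++) {
		if (map[i]) {
			size_t bit = i * BITS_PER_LONG +
				     (size_t)__builtin_ctzl(map[i]);
			return bit < nbits ? bit : nbits;
		}
	}
	return nbits;	/* no bit set */
}
```

Both return nbits when no bit is set. The word version does fewer loads
per bit scanned, while the byte version is the shape that tends to
produce smaller code, which is the trade-off speculated about above.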
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!