From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yury Norov <yury.norov@gmail.com>,
Dennis Zhou <dennis@kernel.org>,
Guenter Roeck <linux@roeck-us.net>,
Catalin Marinas <catalin.marinas@arm.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Geert Uytterhoeven <geert@linux-m68k.org>,
linux-m68k@lists.linux-m68k.org
Subject: Re: Linux 5.19-rc8
Date: Wed, 27 Jul 2022 01:15:45 +0100
Message-ID: <YuCDscyJotkjNQcH@shell.armlinux.org.uk>
In-Reply-To: <CAHk-=wjpYLLoi1m0VRfVoyzGgmMiNwBhQ0XXG0VWwjskcz5Cug@mail.gmail.com>
On Tue, Jul 26, 2022 at 01:20:23PM -0700, Linus Torvalds wrote:
> On Tue, Jul 26, 2022 at 12:44 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > Overall, I would say it's pretty similar (some generic perform
> > marginally better, some native perform marginally better) with the
> > exception of find_first_bit() being much better with the generic
> > implementation, but find_next_zero_bit() being noticeably worse.
>
> The generic _find_first_bit() code is actually sane and simple. It
> loops over words until it finds a non-zero one, and then does trivial
> calculations on that last word.
>
> That explains why the generic code does so much better than your byte-wise asm.
>
> In contrast, the generic _find_next_bit() I find almost offensively
> silly - which in turn explains why your byte-wide asm does better.
>
> I think the generic _find_next_bit() should actually do what the m68k
> find_next_bit code does: handle the first special word itself, and
> then just call find_first_bit() on the rest of it.
>
> And it should *not* try to handle the dynamic "bswap and/or bit sense
> invert" thing at all. That should be just four different (trivial)
> cases for the first word.
Here are the results for the native version converted to use word loads:
[ 37.319937]
Start testing find_bit() with random-filled bitmap
[ 37.330289] find_next_bit: 2222703 ns, 163781 iterations
[ 37.339186] find_next_zero_bit: 2154375 ns, 163900 iterations
[ 37.348118] find_last_bit: 2208104 ns, 163780 iterations
[ 37.372564] find_first_bit: 17722203 ns, 16370 iterations
[ 37.737415] find_first_and_bit: 358135191 ns, 32453 iterations
[ 37.745420] find_next_and_bit: 1280537 ns, 73644 iterations
[ 37.752143]
Start testing find_bit() with sparse bitmap
[ 37.759032] find_next_bit: 41256 ns, 655 iterations
[ 37.769905] find_next_zero_bit: 4148410 ns, 327026 iterations
[ 37.776675] find_last_bit: 48742 ns, 655 iterations
[ 37.790961] find_first_bit: 7562371 ns, 655 iterations
[ 37.797743] find_first_and_bit: 47366 ns, 1 iterations
[ 37.804527] find_next_and_bit: 59924 ns, 1 iterations
which is generally faster than the generic version, with the exception
of the sparse find_first_bit (generic was:
[ 25.657304] find_first_bit: 7328573 ns, 656 iterations)
find_next_{,zero_}bit() in the sparse case are quite a bit faster than
the generic code.
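
For reference, the structure Linus describes above (a find_first_bit
that loops over whole words, and a find_next_bit that handles the first
partial word itself and then calls find_first_bit on the rest) can be
sketched in plain userspace C. This is only an illustration, not the
kernel's lib/find_bit.c and not the ARM assembly measured here. The
sketch_* names and SKETCH_BITS_PER_LONG are hypothetical, it assumes
the GCC/Clang builtin __builtin_ctzl(), and it covers only the
native-endian "find a set bit" case, i.e. one of the four first-word
cases mentioned in the quoted text.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first set bit in the first nbits bits of
 * addr, or nbits if there is none. Loops over whole words until a
 * non-zero one is found, then does the bit arithmetic on that word.
 */
static size_t sketch_find_first_bit(const unsigned long *addr, size_t nbits)
{
	size_t base;

	for (base = 0; base < nbits; base += SKETCH_BITS_PER_LONG) {
		unsigned long word = addr[base / SKETCH_BITS_PER_LONG];

		if (word) {
			/* __builtin_ctzl() is a GCC/Clang builtin (assumption). */
			size_t bit = base + (size_t)__builtin_ctzl(word);

			/* A set bit past nbits in the last word does not count. */
			return bit < nbits ? bit : nbits;
		}
	}
	return nbits;
}

/*
 * Return the index of the first set bit at or after start, or nbits
 * if there is none. Only the first (possibly partial) word is handled
 * here; everything after it is left to sketch_find_first_bit().
 */
static size_t sketch_find_next_bit(const unsigned long *addr, size_t nbits,
				   size_t start)
{
	size_t base;
	unsigned long word;

	if (start >= nbits)
		return nbits;

	/* First word: mask off the bits below start. */
	base = start - (start % SKETCH_BITS_PER_LONG);
	word = addr[base / SKETCH_BITS_PER_LONG] &
	       (~0UL << (start % SKETCH_BITS_PER_LONG));
	if (word) {
		size_t bit = base + (size_t)__builtin_ctzl(word);

		return bit < nbits ? bit : nbits;
	}

	/* Remaining words: plain word loop, offset back to the full bitmap. */
	base += SKETCH_BITS_PER_LONG;
	if (base >= nbits)
		return nbits;
	return base + sketch_find_first_bit(addr + base / SKETCH_BITS_PER_LONG,
					    nbits - base);
}
```

The zero-bit and byte-swapped variants would differ only in how the
word is transformed (inverted and/or swapped) before the test, which
is why they can be separate trivial first-word cases rather than
dynamic flags inside one loop.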
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!