From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yury Norov <yury.norov@gmail.com>,
Dennis Zhou <dennis@kernel.org>,
Guenter Roeck <linux@roeck-us.net>,
Catalin Marinas <catalin.marinas@arm.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Geert Uytterhoeven <geert@linux-m68k.org>,
linux-m68k@lists.linux-m68k.org
Subject: Re: Linux 5.19-rc8
Date: Wed, 27 Jul 2022 01:15:45 +0100
Message-ID: <YuCDscyJotkjNQcH@shell.armlinux.org.uk>
In-Reply-To: <CAHk-=wjpYLLoi1m0VRfVoyzGgmMiNwBhQ0XXG0VWwjskcz5Cug@mail.gmail.com>

On Tue, Jul 26, 2022 at 01:20:23PM -0700, Linus Torvalds wrote:
> On Tue, Jul 26, 2022 at 12:44 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > Overall, I would say it's pretty similar (some generic perform
> > marginally better, some native perform marginally better) with the
> > exception of find_first_bit() being much better with the generic
> > implementation, but find_next_zero_bit() being noticeably worse.
>
> The generic _find_first_bit() code is actually sane and simple. It
> loops over words until it finds a non-zero one, and then does trivial
> calculations on that last word.
>
> That explains why the generic code does so much better than your byte-wise asm.
>
> In contrast, the generic _find_next_bit() I find almost offensively
> silly - which in turn explains why your byte-wide asm does better.
>
> I think the generic _find_next_bit() should actually do what the m68k
> find_next_bit code does: handle the first special word itself, and
> then just call find_first_bit() on the rest of it.
>
> And it should *not* try to handle the dynamic "bswap and/or bit sense
> invert" thing at all. That should be just four different (trivial)
> cases for the first word.
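
For anyone following along, here is a rough sketch of that shape in plain C.
It is not the kernel's code and not the m68k implementation: the sketch_*
names are made up for illustration, it assumes native-endian unsigned long
words with bit 0 as the least significant bit, and it uses a portable loop
where the kernel would use __ffs(). It handles the first (possibly partial)
word by masking, then hands the rest to the simple word loop.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Index of the lowest set bit in a non-zero word (the kernel has __ffs()). */
static unsigned long sketch_ffs(unsigned long word)
{
	unsigned long n = 0;

	while (!(word & 1UL)) {
		word >>= 1;
		n++;
	}
	return n;
}

/* Loop over words until one is non-zero; return size if no bit is set. */
static unsigned long sketch_find_first_bit(const unsigned long *addr,
					   unsigned long size)
{
	unsigned long idx, bit;

	for (idx = 0; idx * SKETCH_BITS_PER_LONG < size; idx++) {
		if (addr[idx]) {
			bit = idx * SKETCH_BITS_PER_LONG + sketch_ffs(addr[idx]);
			return bit < size ? bit : size;
		}
	}
	return size;
}

/* Handle the first word here, then defer to sketch_find_first_bit(). */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long offset)
{
	unsigned long idx, word, bit;

	if (offset >= size)
		return size;

	idx = offset / SKETCH_BITS_PER_LONG;
	/* Drop the bits below offset in the first word. */
	word = addr[idx] & (~0UL << (offset % SKETCH_BITS_PER_LONG));
	if (word) {
		bit = idx * SKETCH_BITS_PER_LONG + sketch_ffs(word);
		return bit < size ? bit : size;
	}

	idx++;
	if (idx * SKETCH_BITS_PER_LONG >= size)
		return size;

	/* The remainder starts on a word boundary, so the plain scan applies. */
	return idx * SKETCH_BITS_PER_LONG +
	       sketch_find_first_bit(addr + idx,
				     size - idx * SKETCH_BITS_PER_LONG);
}
```

The zero-bit and little-endian variants would differ only in how the first
word is loaded and masked; this sketch covers just the plain set-bit case.
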
Here are the results for the native version converted to use word loads:
[ 37.319937]
Start testing find_bit() with random-filled bitmap
[ 37.330289] find_next_bit: 2222703 ns, 163781 iterations
[ 37.339186] find_next_zero_bit: 2154375 ns, 163900 iterations
[ 37.348118] find_last_bit: 2208104 ns, 163780 iterations
[ 37.372564] find_first_bit: 17722203 ns, 16370 iterations
[ 37.737415] find_first_and_bit: 358135191 ns, 32453 iterations
[ 37.745420] find_next_and_bit: 1280537 ns, 73644 iterations
[ 37.752143]
Start testing find_bit() with sparse bitmap
[ 37.759032] find_next_bit: 41256 ns, 655 iterations
[ 37.769905] find_next_zero_bit: 4148410 ns, 327026 iterations
[ 37.776675] find_last_bit: 48742 ns, 655 iterations
[ 37.790961] find_first_bit: 7562371 ns, 655 iterations
[ 37.797743] find_first_and_bit: 47366 ns, 1 iterations
[ 37.804527] find_next_and_bit: 59924 ns, 1 iterations
This is generally faster than the generic version, with the exception
of the sparse find_first_bit(). The generic result for that case was:
[ 25.657304] find_first_bit: 7328573 ns, 656 iterations
find_next_{,zero_}bit() in the sparse case are quite a bit faster than
the generic code.
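
Since the iteration counts differ slightly between runs (655 vs 656), the
totals are easier to compare per iteration. A trivial helper for that, with
a made-up name, purely as an illustration of the arithmetic:

```c
/* Average cost of one call, given a benchmark line's total ns and count. */
static double ns_per_iteration(unsigned long long total_ns,
			       unsigned long long iterations)
{
	return (double)total_ns / (double)iterations;
}
```

Fed the two sparse find_first_bit lines above, ns_per_iteration(7562371, 655)
for the native version comes out a few percent higher than
ns_per_iteration(7328573, 656) for the generic one, so the gap in that case
is small.
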
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!