From: Yury Norov <yury.norov@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>,
Dennis Zhou <dennis@kernel.org>,
Russell King - ARM Linux <linux@armlinux.org.uk>,
Catalin Marinas <catalin.marinas@arm.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Linux 5.19-rc8
Date: Mon, 25 Jul 2022 13:35:48 -0700 [thread overview]
Message-ID: <Yt7+pPyEmYYH8D1K@yury-laptop> (raw)
In-Reply-To: <CAHk-=wgGrewyOqT7Cm-eHKp5W+8rJ=aL8iNtsbhRfc0YD2gbeA@mail.gmail.com>
On Mon, Jul 25, 2022 at 11:49:09AM -0700, Linus Torvalds wrote:
> On Mon, Jul 25, 2022 at 10:55 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I think the fix might be something like this:
>
> Hmm. Maybe the fix is to just not have the arm architecture-specific
> version at all.
I agree (see my other email in the thread). If no objections from ARM
people, I can drop it.
> The generic code handles the "small constant size bitmap that fits in
> a word" case better than the ARM special case code does.
>
> And the generic code handles the "scan large bitmap" case better than
> the ARM code does too, in that it does things a word at a time, while
> the ARM special case code does things one byte at a time.
>
> The ARM code does have a few things going for it:
>
> (a) it's simple
>
> (b) it has separate routines for the little-endian case
>
> Now, (a) is probably not too strong an argument, because it's arguably
> *too* simple, and buggy as a result. And having looked a bit more,
> it's not just _find_next_bit_le() that has this bug, it's all the
> "next" versions (zero-bit and big-endian).
>
> But (b) is actively better than what the generic "find bit" code has.
> The generic code is kind of disgusting in this area, with code like
>
> if (le)
> tmp = swab(tmp);
The patch that adds this is: b78c57135d470 ("lib/find_bit.c: join
_find_next_bit{_le}"), so I did that on purpose.
> in lib/find_bit.c and this is nasty for two reasons:
>
> (1) on little-endian, the "le" flag is mis-named: it's always zero,
> and it never should swab, but the code was taken from some big-endian
> generic case
Yes, the "le" is a bad name, and I think it should be changed. Are you OK
with "need_swab"?
> (2) even on big-endian, that "le" flag is basically a compile-time
> constant, but the header file shenanigans have hidden that fact, so it
> ends up being a "pass a constant to a function that then has to test
> it dynamically" situation
I think it's not measurable, at least find_bit_benchmark didn't get worse.
Even if find_next_bit() is invoked many times in a loop, we can expect
that branch predictor would optimize the difference out.
> So the generic code is in many ways better than the ARM special case
> code, but it has a couple of unfortunate warts too. At least those
> unfortunate warts aren't outright *bugs*, they are just ugly.
Here we have 2 ugly options - having pairs of almost identical
functions, or passing dummy variables. I decided that copy-pasting is
worse than abusing branch predictor.
Thanks,
Yury
next prev parent reply other threads:[~2022-07-25 20:35 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-24 20:42 Linux 5.19-rc8 Linus Torvalds
2022-07-25 16:11 ` Guenter Roeck
2022-07-25 17:55 ` Linus Torvalds
2022-07-25 18:49 ` Linus Torvalds
2022-07-25 20:35 ` Yury Norov [this message]
2022-07-25 20:40 ` Linus Torvalds
2022-07-26 15:51 ` Yury Norov
2022-07-25 19:41 ` Yury Norov
2022-07-26 9:12 ` Russell King (Oracle)
2022-07-26 15:35 ` Yury Norov
2022-07-28 18:28 ` Russell King (Oracle)
2022-07-29 0:11 ` Guenter Roeck
2022-07-26 17:39 ` Dennis Zhou
2022-07-26 17:51 ` Linus Torvalds
2022-07-26 18:18 ` Yury Norov
2022-07-26 18:36 ` Linus Torvalds
2022-07-26 19:44 ` Russell King (Oracle)
2022-07-26 20:20 ` Linus Torvalds
2022-07-27 0:15 ` Russell King (Oracle)
2022-07-27 1:33 ` Yury Norov
2022-07-27 7:43 ` Russell King (Oracle)
2022-07-30 21:38 ` Yury Norov
2022-08-01 15:48 ` Russell King (Oracle)
2022-08-01 15:54 ` Russell King (Oracle)
2022-07-27 7:46 ` David Laight
2022-07-25 20:34 ` Build regressions/improvements in v5.19-rc8 Geert Uytterhoeven
2022-07-25 20:39 ` Geert Uytterhoeven
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yt7+pPyEmYYH8D1K@yury-laptop \
--to=yury.norov@gmail.com \
--cc=catalin.marinas@arm.com \
--cc=dennis@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=linux@roeck-us.net \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox