public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Yury Norov <yury.norov@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>,
	Dennis Zhou <dennis@kernel.org>,
	Russell King - ARM Linux <linux@armlinux.org.uk>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Linux 5.19-rc8
Date: Mon, 25 Jul 2022 13:35:48 -0700	[thread overview]
Message-ID: <Yt7+pPyEmYYH8D1K@yury-laptop> (raw)
In-Reply-To: <CAHk-=wgGrewyOqT7Cm-eHKp5W+8rJ=aL8iNtsbhRfc0YD2gbeA@mail.gmail.com>

On Mon, Jul 25, 2022 at 11:49:09AM -0700, Linus Torvalds wrote:
> On Mon, Jul 25, 2022 at 10:55 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I think the fix might be something like this:
> 
> Hmm. Maybe the fix is to just not have the arm architecture-specific
> version at all.

I agree (see my other email in the thread). If no objections from ARM
people, I can drop it.

> The generic code handles the "small constant size bitmap that fits in
> a word" case better than the ARM special case code does.
> 
> And the generic code handles the "scan large bitmap" case better than
> the ARM code does too, in that it does things a word at a time, while
> the ARM special case code does things one byte at a time.
> 
> The ARM code does have a few things going for it:
> 
>  (a) it's simple
> 
>  (b) it has separate routines for the little-endian case
> 
> Now, (a) is probably not too strong an argument, because it's arguably
> *too* simple, and buggy as a result. And having looked a bit more,
> it's not just _find_next_bit_le() that has this bug, it's all the
> "next" versions (zero-bit and big-endian).
> 
> But (b) is actively better than what the generic "find bit" code has.
> The generic code is kind of disgusting in this area, with code like
> 
>         if (le)
>                 tmp = swab(tmp);

The patch that adds this is: b78c57135d470 ("lib/find_bit.c: join
_find_next_bit{_le}"), so I did that on purpose.

> in lib/find_bit.c and this is nasty for two reasons:
> 
>  (1) on little-endian, the "le" flag is mis-named: it's always zero,
> and it never should swab, but the code was taken from some big-endian
> generic case

Yes, the "le" is a bad name, and I think it should be changed. Are you OK
with "need_swab"?
 
>  (2) even on big-endian, that "le" flag is basically a compile-time
> constant, but the header file shenanigans have hidden that fact, so it
> ends up being a "pass a constant to a function that then has to test
> it dynamically" situation

I think it's not measurable, at least find_bit_benchmark didn't get worse.
Even if find_next_bit() is invoked many times in a loop, we can expect
that branch predictor would optimize the difference out.
 
> So the generic code is in many ways better than the ARM special case
> code, but it has a couple of unfortunate warts too. At least those
> unfortunate warts aren't outright *bugs*, they are just ugly.

Here we have 2 ugly options - having pairs of almost identical
functions, or passing dummy variables. I decided that copy-pasting is
worse than abusing branch predictor.

Thanks,
Yury

  reply	other threads:[~2022-07-25 20:35 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-24 20:42 Linux 5.19-rc8 Linus Torvalds
2022-07-25 16:11 ` Guenter Roeck
2022-07-25 17:55   ` Linus Torvalds
2022-07-25 18:49     ` Linus Torvalds
2022-07-25 20:35       ` Yury Norov [this message]
2022-07-25 20:40         ` Linus Torvalds
2022-07-26 15:51           ` Yury Norov
2022-07-25 19:41     ` Yury Norov
2022-07-26  9:12     ` Russell King (Oracle)
2022-07-26 15:35       ` Yury Norov
2022-07-28 18:28       ` Russell King (Oracle)
2022-07-29  0:11         ` Guenter Roeck
2022-07-26 17:39     ` Dennis Zhou
2022-07-26 17:51       ` Linus Torvalds
2022-07-26 18:18         ` Yury Norov
2022-07-26 18:36           ` Linus Torvalds
2022-07-26 19:44             ` Russell King (Oracle)
2022-07-26 20:20               ` Linus Torvalds
2022-07-27  0:15                 ` Russell King (Oracle)
2022-07-27  1:33                   ` Yury Norov
2022-07-27  7:43                     ` Russell King (Oracle)
2022-07-30 21:38                       ` Yury Norov
2022-08-01 15:48                         ` Russell King (Oracle)
2022-08-01 15:54                           ` Russell King (Oracle)
2022-07-27  7:46                     ` David Laight
2022-07-25 20:34 ` Build regressions/improvements in v5.19-rc8 Geert Uytterhoeven
2022-07-25 20:39   ` Geert Uytterhoeven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yt7+pPyEmYYH8D1K@yury-laptop \
    --to=yury.norov@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=dennis@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linux@roeck-us.net \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox