All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yury Norov <yury.norov@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>,
	Dennis Zhou <dennis@kernel.org>,
	Russell King - ARM Linux <linux@armlinux.org.uk>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Linux 5.19-rc8
Date: Mon, 25 Jul 2022 13:35:48 -0700	[thread overview]
Message-ID: <Yt7+pPyEmYYH8D1K@yury-laptop> (raw)
In-Reply-To: <CAHk-=wgGrewyOqT7Cm-eHKp5W+8rJ=aL8iNtsbhRfc0YD2gbeA@mail.gmail.com>

On Mon, Jul 25, 2022 at 11:49:09AM -0700, Linus Torvalds wrote:
> On Mon, Jul 25, 2022 at 10:55 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I think the fix might be something like this:
> 
> Hmm. Maybe the fix is to just not have the arm architecture-specific
> version at all.

I agree (see my other email in the thread). If no objections from ARM
people, I can drop it.

> The generic code handles the "small constant size bitmap that fits in
> a word" case better than the ARM special case code does.
> 
> And the generic code handles the "scan large bitmap" case better than
> the ARM code does too, in that it does things a word at a time, while
> the ARM special case code does things one byte at a time.
> 
> The ARM code does have a few things going for it:
> 
>  (a) it's simple
> 
>  (b) it has separate routines for the little-endian case
> 
> Now, (a) is probably not too strong an argument, because it's arguably
> *too* simple, and buggy as a result. And having looked a bit more,
> it's not just _find_next_bit_le() that has this bug, it's all the
> "next" versions (zero-bit and big-endian).
> 
> But (b) is actively better than what the generic "find bit" code has.
> The generic code is kind of disgusting in this area, with code like
> 
>         if (le)
>                 tmp = swab(tmp);

The patch that adds this is: b78c57135d470 ("lib/find_bit.c: join
_find_next_bit{_le}"), so I did that on purpose.

> in lib/find_bit.c and this is nasty for two reasons:
> 
>  (1) on little-endian, the "le" flag is mis-named: it's always zero,
> and it never should swab, but the code was taken from some big-endian
> generic case

Yes, the "le" is a bad name, and I think it should be changed. Are you OK
with "need_swab"?
 
>  (2) even on big-endian, that "le" flag is basically a compile-time
> constant, but the header file shenanigans have hidden that fact, so it
> ends up being a "pass a constant to a function that then has to test
> it dynamically" situation

I think it's not measurable, at least find_bit_benchmark didn't get worse.
Even if find_next_bit() is invoked many times in a loop, we can expect
that branch predictor would optimize the difference out.
 
> So the generic code is in many ways better than the ARM special case
> code, but it has a couple of unfortunate warts too. At least those
> unfortunate warts aren't outright *bugs*, they are just ugly.

Here we have 2 ugly options - having pairs of almost identical
functions, or passing dummy variables. I decided that copy-pasting is
worse than abusing branch predictor.

Thanks,
Yury

  reply	other threads:[~2022-07-25 20:35 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-24 20:42 Linux 5.19-rc8 Linus Torvalds
2022-07-25 16:11 ` Guenter Roeck
2022-07-25 17:55   ` Linus Torvalds
2022-07-25 18:49     ` Linus Torvalds
2022-07-25 20:35       ` Yury Norov [this message]
2022-07-25 20:40         ` Linus Torvalds
2022-07-26 15:51           ` Yury Norov
2022-07-25 19:41     ` Yury Norov
2022-07-26  9:12     ` Russell King (Oracle)
2022-07-26 15:35       ` Yury Norov
2022-07-28 18:28       ` Russell King (Oracle)
2022-07-29  0:11         ` Guenter Roeck
2022-07-26 17:39     ` Dennis Zhou
2022-07-26 17:51       ` Linus Torvalds
2022-07-26 18:18         ` Yury Norov
2022-07-26 18:36           ` Linus Torvalds
2022-07-26 19:44             ` Russell King (Oracle)
2022-07-26 20:20               ` Linus Torvalds
2022-07-27  0:15                 ` Russell King (Oracle)
2022-07-27  1:33                   ` Yury Norov
2022-07-27  7:43                     ` Russell King (Oracle)
2022-07-30 21:38                       ` Yury Norov
2022-08-01 15:48                         ` Russell King (Oracle)
2022-08-01 15:54                           ` Russell King (Oracle)
2022-07-27  7:46                     ` David Laight
2022-07-25 20:34 ` Build regressions/improvements in v5.19-rc8 Geert Uytterhoeven
2022-07-25 20:39   ` Geert Uytterhoeven
2022-07-25 20:39     ` Geert Uytterhoeven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yt7+pPyEmYYH8D1K@yury-laptop \
    --to=yury.norov@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=dennis@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=linux@roeck-us.net \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.