From: Yury Norov <yury.norov@gmail.com>
To: cp0613@linux.alibaba.com
Cc: alex@ghiti.fr, aou@eecs.berkeley.edu, arnd@arndb.de,
linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-riscv@lists.infradead.org, linux@rasmusvillemoes.dk,
palmer@dabbelt.com, paul.walmsley@sifive.com
Subject: Re: [PATCH 2/2] bitops: rotate: Add riscv implementation using Zbb extension
Date: Sat, 28 Jun 2025 21:48:02 -0400 [thread overview]
Message-ID: <aGCbRguHwFY372Ut@yury> (raw)
In-Reply-To: <20250628111357.1627-1-cp0613@linux.alibaba.com>
On Sat, Jun 28, 2025 at 07:13:57PM +0800, cp0613@linux.alibaba.com wrote:
> On Fri, 20 Jun 2025 12:20:47 -0400, yury.norov@gmail.com wrote:
>
> > Can you add a comment about what is happening here? Are you sure it's
> > optimized out in case of the 'legacy' alternative?
>
> Thank you for your review. Yes, I referred to the existing variable__fls()
> implementation, which should be fine.
No, it's not fine. Because you trimmed your original email completely,
so there's no way to understand what I'm asking about; and because you
didn't answer my question. So I'll ask again: what exactly you are doing
in the line you've trimmed out?
> > Here you wire ror/rol() to the variable_ror/rol() unconditionally, and
> > that breaks compile-time rotation if the parameter is known at compile
> > time.
> >
> > I believe, generic implementation will allow compiler to handle this
> > case better. Can you do a similar thing to what fls() does in the same
> > file?
>
> I did consider it, but I did not find any toolchain that provides an
> implementation similar to __builtin_ror or __builtin_rol. If there is one,
> please help point it out.
This is the example of the toolchain you're looking for:
/**
* rol64 - rotate a 64-bit value left
* @word: value to rotate
* @shift: bits to roll
*/
static inline __u64 rol64(__u64 word, unsigned int shift)
{
return (word << (shift & 63)) | (word >> ((-shift) & 63));
}
What I'm asking is: please show me that compile-time rol/ror is still
calculated at compile time, i.e. ror64(1234, 12) is evaluated at
compile time.
> In addition, I did not consider it carefully before. If the rotate function
> is to be genericized, all archneed to include <asm-generic/bitops/rotate.h>.
> I missed this step.
Sorry, I'm lost here about what you've considered and what not. I'm OK
about accelerating ror/rol, but I want to make sure that;
1. The most trivial compile-case is actually evaluated at compile time; and
2. Any arch-specific code is well explained; and
3. legacy case optimized just as well as non-legacy.
Thanks,
Yury
WARNING: multiple messages have this Message-ID (diff)
From: Yury Norov <yury.norov@gmail.com>
To: cp0613@linux.alibaba.com
Cc: alex@ghiti.fr, aou@eecs.berkeley.edu, arnd@arndb.de,
linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-riscv@lists.infradead.org, linux@rasmusvillemoes.dk,
palmer@dabbelt.com, paul.walmsley@sifive.com
Subject: Re: [PATCH 2/2] bitops: rotate: Add riscv implementation using Zbb extension
Date: Sat, 28 Jun 2025 21:48:02 -0400 [thread overview]
Message-ID: <aGCbRguHwFY372Ut@yury> (raw)
In-Reply-To: <20250628111357.1627-1-cp0613@linux.alibaba.com>
On Sat, Jun 28, 2025 at 07:13:57PM +0800, cp0613@linux.alibaba.com wrote:
> On Fri, 20 Jun 2025 12:20:47 -0400, yury.norov@gmail.com wrote:
>
> > Can you add a comment about what is happening here? Are you sure it's
> > optimized out in case of the 'legacy' alternative?
>
> Thank you for your review. Yes, I referred to the existing variable__fls()
> implementation, which should be fine.
No, it's not fine. Because you trimmed your original email completely,
so there's no way to understand what I'm asking about; and because you
didn't answer my question. So I'll ask again: what exactly you are doing
in the line you've trimmed out?
> > Here you wire ror/rol() to the variable_ror/rol() unconditionally, and
> > that breaks compile-time rotation if the parameter is known at compile
> > time.
> >
> > I believe, generic implementation will allow compiler to handle this
> > case better. Can you do a similar thing to what fls() does in the same
> > file?
>
> I did consider it, but I did not find any toolchain that provides an
> implementation similar to __builtin_ror or __builtin_rol. If there is one,
> please help point it out.
This is the example of the toolchain you're looking for:
/**
* rol64 - rotate a 64-bit value left
* @word: value to rotate
* @shift: bits to roll
*/
static inline __u64 rol64(__u64 word, unsigned int shift)
{
return (word << (shift & 63)) | (word >> ((-shift) & 63));
}
What I'm asking is: please show me that compile-time rol/ror is still
calculated at compile time, i.e. ror64(1234, 12) is evaluated at
compile time.
> In addition, I did not consider it carefully before. If the rotate function
> is to be genericized, all archneed to include <asm-generic/bitops/rotate.h>.
> I missed this step.
Sorry, I'm lost here about what you've considered and what not. I'm OK
about accelerating ror/rol, but I want to make sure that;
1. The most trivial compile-case is actually evaluated at compile time; and
2. Any arch-specific code is well explained; and
3. legacy case optimized just as well as non-legacy.
Thanks,
Yury
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
next prev parent reply other threads:[~2025-06-29 1:48 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-20 11:16 [PATCH 0/2] Implementing bitops rotate using riscv Zbb extension cp0613
2025-06-20 11:16 ` cp0613
2025-06-20 11:16 ` [PATCH 1/2] bitops: generic rotate cp0613
2025-06-20 11:16 ` cp0613
2025-06-20 15:47 ` kernel test robot
2025-06-20 15:47 ` kernel test robot
2025-06-23 11:59 ` kernel test robot
2025-06-23 11:59 ` kernel test robot
2025-06-20 11:16 ` [PATCH 2/2] bitops: rotate: Add riscv implementation using Zbb extension cp0613
2025-06-20 11:16 ` cp0613
2025-06-20 16:20 ` Yury Norov
2025-06-20 16:20 ` Yury Norov
2025-06-25 16:02 ` David Laight
2025-06-25 16:02 ` David Laight
2025-06-28 12:08 ` cp0613
2025-06-28 12:08 ` cp0613
2025-06-29 10:38 ` David Laight
2025-06-29 10:38 ` David Laight
2025-06-30 12:14 ` cp0613
2025-06-30 12:14 ` cp0613
2025-06-30 17:35 ` David Laight
2025-06-30 17:35 ` David Laight
2025-07-01 13:01 ` cp0613
2025-07-01 13:01 ` cp0613
2025-06-28 11:13 ` cp0613
2025-06-28 11:13 ` cp0613
2025-06-29 1:48 ` Yury Norov [this message]
2025-06-29 1:48 ` Yury Norov
2025-06-30 12:04 ` cp0613
2025-06-30 12:04 ` cp0613
2025-06-30 16:53 ` Yury Norov
2025-06-30 16:53 ` Yury Norov
2025-07-01 12:47 ` cp0613
2025-07-01 12:47 ` cp0613
2025-07-01 18:32 ` Yury Norov
2025-07-01 18:32 ` Yury Norov
2025-07-02 10:11 ` David Laight
2025-07-02 10:11 ` David Laight
2025-07-03 16:58 ` Yury Norov
2025-07-03 16:58 ` Yury Norov
2025-07-02 12:30 ` cp0613
2025-07-02 12:30 ` cp0613
-- strict thread matches above, loose matches on Subject: below --
2025-06-20 17:40 kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aGCbRguHwFY372Ut@yury \
--to=yury.norov@gmail.com \
--cc=alex@ghiti.fr \
--cc=aou@eecs.berkeley.edu \
--cc=arnd@arndb.de \
--cc=cp0613@linux.alibaba.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=linux@rasmusvillemoes.dk \
--cc=palmer@dabbelt.com \
--cc=paul.walmsley@sifive.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.