From: David Laight <david.laight.linux@gmail.com>
To: Yury Norov <yury.norov@gmail.com>
Cc: cp0613@linux.alibaba.com, linux@rasmusvillemoes.dk,
arnd@arndb.de, paul.walmsley@sifive.com, palmer@dabbelt.com,
aou@eecs.berkeley.edu, alex@ghiti.fr,
linux-riscv@lists.infradead.org, linux-arch@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] bitops: rotate: Add riscv implementation using Zbb extension
Date: Wed, 25 Jun 2025 17:02:34 +0100
Message-ID: <20250625170234.29605eed@pumpkin>
In-Reply-To: <aFWKX4rpuNCDBP67@yury>
On Fri, 20 Jun 2025 12:20:47 -0400
Yury Norov <yury.norov@gmail.com> wrote:
> On Fri, Jun 20, 2025 at 07:16:10PM +0800, cp0613@linux.alibaba.com wrote:
> > From: Chen Pei <cp0613@linux.alibaba.com>
> >
> > The RISC-V Zbb extension[1] defines bitwise rotation instructions,
> > which can be used to implement rotate related functions.
> >
> > [1] https://github.com/riscv/riscv-bitmanip/
> >
> > Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
> > ---
> > arch/riscv/include/asm/bitops.h | 172 ++++++++++++++++++++++++++++++++
> > 1 file changed, 172 insertions(+)
> >
> > diff --git a/arch/riscv/include/asm/bitops.h b/arch/riscv/include/asm/bitops.h
> > index d59310f74c2b..be247ef9e686 100644
> > --- a/arch/riscv/include/asm/bitops.h
> > +++ b/arch/riscv/include/asm/bitops.h
> > @@ -20,17 +20,20 @@
> > #include <asm-generic/bitops/__fls.h>
> > #include <asm-generic/bitops/ffs.h>
> > #include <asm-generic/bitops/fls.h>
> > +#include <asm-generic/bitops/rotate.h>
> >
> > #else
> > #define __HAVE_ARCH___FFS
> > #define __HAVE_ARCH___FLS
> > #define __HAVE_ARCH_FFS
> > #define __HAVE_ARCH_FLS
> > +#define __HAVE_ARCH_ROTATE
> >
> > #include <asm-generic/bitops/__ffs.h>
> > #include <asm-generic/bitops/__fls.h>
> > #include <asm-generic/bitops/ffs.h>
> > #include <asm-generic/bitops/fls.h>
> > +#include <asm-generic/bitops/rotate.h>
> >
> > #include <asm/alternative-macros.h>
> > #include <asm/hwcap.h>
> > @@ -175,6 +178,175 @@ static __always_inline int variable_fls(unsigned int x)
> > variable_fls(x_); \
> > })
>
> ...
>
> > +static inline u8 variable_ror8(u8 word, unsigned int shift)
> > +{
> > + u32 word32 = ((u32)word << 24) | ((u32)word << 16) | ((u32)word << 8) | word;
>
> Can you add a comment about what is happening here? Are you sure it's
> optimized out in case of the 'legacy' alternative?
Is it even a gain in the Zbb case?
"rorw" only ever helps with full-word (32-bit) rotates.
For ror8 you might as well do ((word << 8 | word) >> shift).
For rol8 you'd need ((word << 24 | word) 'rol' shift).
I still bet the generic code is faster (but see below).
The same applies to the 16-bit rotates.
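As a rough sketch of the two expressions above and their 16-bit analogues, in plain C. All names here (rotl32, ror8_dup, rol8_dup, ror16_dup, rol16_dup, check_dup_rotates) are hypothetical and not from the patch; rotl32() only stands in for a single rotate instruction, and the shift count is assumed to be reduced modulo the operand width.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one 32-bit rotate-left instruction. */
static inline uint32_t rotl32(uint32_t x, unsigned int n)
{
	n &= 31;
	return (x << n) | (x >> ((32 - n) & 31));
}

/* ror8: put a second copy of the byte above the first, then shift right. */
static inline uint8_t ror8_dup(uint8_t w, unsigned int shift)
{
	uint32_t d = ((uint32_t)w << 8) | w;

	return (uint8_t)(d >> (shift & 7));
}

/* rol8: copies in the top and bottom bytes; bits rotated out of the
 * top byte wrap around into the low byte. */
static inline uint8_t rol8_dup(uint8_t w, unsigned int shift)
{
	uint32_t d = ((uint32_t)w << 24) | w;

	return (uint8_t)rotl32(d, shift & 7);
}

/* 16-bit analogues: two copies fill the whole 32-bit word. */
static inline uint16_t ror16_dup(uint16_t w, unsigned int shift)
{
	uint32_t d = ((uint32_t)w << 16) | w;

	return (uint16_t)(d >> (shift & 15));
}

static inline uint16_t rol16_dup(uint16_t w, unsigned int shift)
{
	uint32_t d = ((uint32_t)w << 16) | w;

	return (uint16_t)rotl32(d, shift & 15);
}

/* Exhaustive comparison of the 8-bit helpers with a mask-and-OR rotate. */
static void check_dup_rotates(void)
{
	for (unsigned int v = 0; v < 256; v++) {
		for (unsigned int s = 0; s < 8; s++) {
			uint8_t l = (uint8_t)((v << s) | (v >> ((8 - s) & 7)));
			uint8_t r = (uint8_t)((v >> s) | (v << ((8 - s) & 7)));

			assert(rol8_dup((uint8_t)v, s) == l);
			assert(ror8_dup((uint8_t)v, s) == r);
		}
	}
}
```

Whether any of this beats the generic code on a given core is a separate question that only the generated assembly can answer.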
Actually the generic version is (probably) horrid for everything except x86.
See https://www.godbolt.org/z/xTxYj57To
unsigned char rol(unsigned char v, unsigned int shift)
{
	return (v << 8 | v) << shift >> 8;
}

unsigned char ror(unsigned char v, unsigned int shift)
{
	return (v << 8 | v) >> shift;
}
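A small self-check for the two functions above, as a sketch only. They are copied under the hypothetical names rol_dup/ror_dup and compared with a conventional mask-and-OR rotate (rol_mask/ror_mask, also hypothetical names) for every byte value. The comparison assumes shift is in 0..7; the duplicated-byte forms do not mask the count themselves, so a caller would have to apply (shift & 7) first.

```c
#include <assert.h>

/* Duplicated-byte rotates from the message; valid for shift in 0..7. */
static unsigned char rol_dup(unsigned char v, unsigned int shift)
{
	return (v << 8 | v) << shift >> 8;
}

static unsigned char ror_dup(unsigned char v, unsigned int shift)
{
	return (v << 8 | v) >> shift;
}

/* Conventional mask-and-OR rotates, used here only as a reference. */
static unsigned char rol_mask(unsigned char v, unsigned int shift)
{
	return (unsigned char)((v << (shift & 7)) | (v >> (-shift & 7)));
}

static unsigned char ror_mask(unsigned char v, unsigned int shift)
{
	return (unsigned char)((v >> (shift & 7)) | (v << (-shift & 7)));
}

/* Compare both forms for every byte value and every shift in 0..7. */
static void check_rotates(void)
{
	for (unsigned int v = 0; v < 256; v++) {
		for (unsigned int s = 0; s < 8; s++) {
			assert(rol_dup((unsigned char)v, s) ==
			       rol_mask((unsigned char)v, s));
			assert(ror_dup((unsigned char)v, s) ==
			       ror_mask((unsigned char)v, s));
		}
	}
}
```

Calling check_rotates() from a test program exercises all 2048 combinations. It says nothing about code quality, which is what the godbolt link is for.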
David