[PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
@ 2026-03-12 23:08 Yury Norov
  2026-03-12 23:54 ` Enzo Matsumiya
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Yury Norov @ 2026-03-12 23:08 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers,
	Jason A. Donenfeld, Ard Biesheuvel
  Cc: Yury Norov, linux-kernel, kexec, linux-cifs, linux-spi,
	linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Jason Gunthorpe,
	Leon Romanovsky, Mark Brown, Steve French, Alexander Graf,
	Mike Rapoport, Pasha Tatashin

Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.

The 32-bit behaviour is inconsistent with the function description, so it
needs to get fixed.

There are 9 individual users for the function in 6 different subsystems.
Some arches and drivers are 64-bit only:
 - arch/loongarch/kvm/intc/eiointc.c;
 - drivers/hv/mshv_vtl_main.c;
 - kernel/liveupdate/kexec_handover.c;

The others are:
 - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;
 - rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear;
 - lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental;

None of them explicitly tweak their code for a word length, or x == 0.

Requesting comments from the corresponding maintainers on how to proceed
with this.

The attached patch gets rid of 32-bit explicit support, so that both
32- and 64-bit versions rely on __ffs().

CC: "K. Y. Srinivasan" <kys@microsoft.com> (hyperv)
CC: Haiyang Zhang <haiyangz@microsoft.com> (hyperv)
CC: Jason Gunthorpe <jgg@ziepe.ca> (infiniband)
CC: Leon Romanovsky <leon@kernel.org> (infiniband)
CC: Mark Brown <broonie@kernel.org> (spi)
CC: Steve French <sfrench@samba.org> (smb)
CC: Alexander Graf <graf@amazon.com> (kexec)
CC: Mike Rapoport <rppt@kernel.org> (kexec)
CC: Pasha Tatashin <pasha.tatashin@soleen.com> (kexec)
Signed-off-by: Yury Norov <ynorov@nvidia.com>
---
 include/linux/count_zeros.h | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/include/linux/count_zeros.h b/include/linux/count_zeros.h
index 4e5680327ece..5034a30b5c7c 100644
--- a/include/linux/count_zeros.h
+++ b/include/linux/count_zeros.h
@@ -10,6 +10,8 @@
 
 #include <asm/bitops.h>
 
+#define COUNT_TRAILING_ZEROS_0 (-1)
+
 /**
  * count_leading_zeros - Count the number of zeros from the MSB back
  * @x: The value
@@ -40,12 +42,7 @@ static inline int count_leading_zeros(unsigned long x)
  */
 static inline int count_trailing_zeros(unsigned long x)
 {
-#define COUNT_TRAILING_ZEROS_0 (-1)
-
-	if (sizeof(x) == 4)
-		return ffs(x);
-	else
-		return (x != 0) ? __ffs(x) : COUNT_TRAILING_ZEROS_0;
+	return (x != 0) ? __ffs(x) : COUNT_TRAILING_ZEROS_0;
 }
 
 #endif /* _LINUX_BITOPS_COUNT_ZEROS_H_ */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  2026-03-12 23:08 [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() Yury Norov
@ 2026-03-12 23:54 ` Enzo Matsumiya
  2026-03-13 16:31   ` Yury Norov
  2026-03-13  9:47 ` Andy Shevchenko
  2026-03-13 17:18 ` Jason Gunthorpe
  2 siblings, 1 reply; 10+ messages in thread
From: Enzo Matsumiya @ 2026-03-12 23:54 UTC (permalink / raw)
  To: Yury Norov
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers,
	Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec,
	linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan,
	Haiyang Zhang, Jason Gunthorpe, Leon Romanovsky, Mark Brown,
	Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin

Hi Yury,

On 03/12, Yury Norov wrote:
>Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
>to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
>ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.
>
>The 32-bit behaviour is inconsistent with the function description, so it
>needs to get fixed.
>
>There are 9 individual users for the function in 6 different subsystems.
>Some arches and drivers are 64-bit only:
> - arch/loongarch/kvm/intc/eiointc.c;
> - drivers/hv/mshv_vtl_main.c;
> - kernel/liveupdate/kexec_handover.c;
>
>The others are:
> - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;
> - rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear;
> - lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental;
>
>None of them explicitly tweak their code for a word length, or x == 0.

Context for lz77_match_len() case:

	const u64 diff = lz77_read64(cur) ^ lz77_read64(wnd);

	if (!diff) {
	...
	}

	cur += count_trailing_zeros(diff) >> 3;

So x == 0 is checked, however it does assume that
sizeof(unsigned long) == sizeof(u64).  I'll have to fix it for when
that's not the case (even with your patch in, as __ffs() casts x to
unsigned long down the line).  Thanks for the heads up.


Cheers,

Enzo

>Requesting comments from the corresponding maintainers on how to proceed
>with this.
>
>The attached patch gets rid of 32-bit explicit support, so that both
>32- and 64-bit versions rely on __ffs().
>
>CC: "K. Y. Srinivasan" <kys@microsoft.com> (hyperv)
>CC: Haiyang Zhang <haiyangz@microsoft.com> (hyperv)
>CC: Jason Gunthorpe <jgg@ziepe.ca> (infiniband)
>CC: Leon Romanovsky <leon@kernel.org> (infiniband)
>CC: Mark Brown <broonie@kernel.org> (spi)
>CC: Steve French <sfrench@samba.org> (smb)
>CC: Alexander Graf <graf@amazon.com> (kexec)
>CC: Mike Rapoport <rppt@kernel.org> (kexec)
>CC: Pasha Tatashin <pasha.tatashin@soleen.com> (kexec)
>Signed-off-by: Yury Norov <ynorov@nvidia.com>
>---
> include/linux/count_zeros.h | 9 +++------
> 1 file changed, 3 insertions(+), 6 deletions(-)
>
>diff --git a/include/linux/count_zeros.h b/include/linux/count_zeros.h
>index 4e5680327ece..5034a30b5c7c 100644
>--- a/include/linux/count_zeros.h
>+++ b/include/linux/count_zeros.h
>@@ -10,6 +10,8 @@
>
> #include <asm/bitops.h>
>
>+#define COUNT_TRAILING_ZEROS_0 (-1)
>+
> /**
>  * count_leading_zeros - Count the number of zeros from the MSB back
>  * @x: The value
>@@ -40,12 +42,7 @@ static inline int count_leading_zeros(unsigned long x)
>  */
> static inline int count_trailing_zeros(unsigned long x)
> {
>-#define COUNT_TRAILING_ZEROS_0 (-1)
>-
>-	if (sizeof(x) == 4)
>-		return ffs(x);
>-	else
>-		return (x != 0) ? __ffs(x) : COUNT_TRAILING_ZEROS_0;
>+	return (x != 0) ? __ffs(x) : COUNT_TRAILING_ZEROS_0;
> }
>
> #endif /* _LINUX_BITOPS_COUNT_ZEROS_H_ */
>-- 
>2.43.0
>
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  2026-03-12 23:54 ` Enzo Matsumiya
@ 2026-03-13 16:31   ` Yury Norov
  0 siblings, 0 replies; 10+ messages in thread
From: Yury Norov @ 2026-03-13 16:31 UTC (permalink / raw)
  To: Enzo Matsumiya
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers,
	Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec,
	linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan,
	Haiyang Zhang, Jason Gunthorpe, Leon Romanovsky, Mark Brown,
	Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin

On Thu, Mar 12, 2026 at 08:54:29PM -0300, Enzo Matsumiya wrote:
> Hi Yury,
> 
> On 03/12, Yury Norov wrote:
> > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
> > to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
> > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.
> > 
> > The 32-bit behaviour is inconsistent with the function description, so it
> > needs to get fixed.
> > 
> > There are 9 individual users for the function in 6 different subsystems.
> > Some arches and drivers are 64-bit only:
> > - arch/loongarch/kvm/intc/eiointc.c;
> > - drivers/hv/mshv_vtl_main.c;
> > - kernel/liveupdate/kexec_handover.c;
> > 
> > The others are:
> > - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;
> > - rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear;
> > - lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental;
> > 
> > None of them explicitly tweak their code for a word length, or x == 0.
> 
> Context for lz77_match_len() case:
> 
> 	const u64 diff = lz77_read64(cur) ^ lz77_read64(wnd);
> 
> 	if (!diff) {
> 	...
> 	}
> 
> 	cur += count_trailing_zeros(diff) >> 3;
> 
> So x == 0 is checked, however it does assume that
> sizeof(unsigned long) == sizeof(u64).  I'll have to fix it for when
> that's not the case (even with your patch in, as __ffs() casts x to
> unsigned long down the line).  Thanks for the heads up.
 
Yes, in your case you need __ffs64() (which doesn't exist). Or simply
leverage bitmaps API:

        DECLARE_BITMAP(bitmap, 64);

        ...

        bitmap_from_u64(bitmap, diff);
        cur += find_first_bit(bitmap, 64) >> 3;
> 
> Cheers,
> 
> Enzo

So, if you're not objecting to wire 32-bit count_trailing_zeros() to
__ffs(), can you please send your ack?

Thanks,
Yury

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  2026-03-12 23:08 [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() Yury Norov
  2026-03-12 23:54 ` Enzo Matsumiya
@ 2026-03-13  9:47 ` Andy Shevchenko
  2026-03-13 17:48   ` Yury Norov
  2026-03-13 17:18 ` Jason Gunthorpe
  2 siblings, 1 reply; 10+ messages in thread
From: Andy Shevchenko @ 2026-03-13  9:47 UTC (permalink / raw)
  To: Yury Norov
  Cc: Yury Norov, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld,
	Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi,
	linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Jason Gunthorpe,
	Leon Romanovsky, Mark Brown, Steve French, Alexander Graf,
	Mike Rapoport, Pasha Tatashin

On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote:
> Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
> to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
> ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.
> 
> The 32-bit behaviour is inconsistent with the function description, so it
> needs to get fixed.
> 
> There are 9 individual users for the function in 6 different subsystems.
> Some arches and drivers are 64-bit only:
>  - arch/loongarch/kvm/intc/eiointc.c;
>  - drivers/hv/mshv_vtl_main.c;
>  - kernel/liveupdate/kexec_handover.c;
> 
> The others are:
>  - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;
>  - rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear;
>  - lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental;
> 
> None of them explicitly tweak their code for a word length, or x == 0.
> 
> Requesting comments from the corresponding maintainers on how to proceed
> with this.
> 
> The attached patch gets rid of 32-bit explicit support, so that both
> 32- and 64-bit versions rely on __ffs().

> CC: "K. Y. Srinivasan" <kys@microsoft.com> (hyperv)
> CC: Haiyang Zhang <haiyangz@microsoft.com> (hyperv)
> CC: Jason Gunthorpe <jgg@ziepe.ca> (infiniband)
> CC: Leon Romanovsky <leon@kernel.org> (infiniband)
> CC: Mark Brown <broonie@kernel.org> (spi)
> CC: Steve French <sfrench@samba.org> (smb)
> CC: Alexander Graf <graf@amazon.com> (kexec)
> CC: Mike Rapoport <rppt@kernel.org> (kexec)
> CC: Pasha Tatashin <pasha.tatashin@soleen.com> (kexec)

Please, move the Cc: list to the...

> Signed-off-by: Yury Norov <ynorov@nvidia.com>
> ---

...comments block. It will have the same effect on emails, but drastically
reduces unneeded noise in the commit message in the Git history.

You may also read this subthread (patch 18) on how to handle it locally:
https://lore.kernel.org/linux-iio/20260123113708.416727-19-bigeasy@linutronix.de/

>  include/linux/count_zeros.h | 9 +++------

...

> +#define COUNT_TRAILING_ZEROS_0 (-1)

Shouldn't we also saturate this to BITS_PER_LONG?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  2026-03-13  9:47 ` Andy Shevchenko
@ 2026-03-13 17:48   ` Yury Norov
  2026-03-13 18:41     ` Konstantin Ryabitsev
  0 siblings, 1 reply; 10+ messages in thread
From: Yury Norov @ 2026-03-13 17:48 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Yury Norov, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld,
	Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi,
	linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Jason Gunthorpe,
	Leon Romanovsky, Mark Brown, Steve French, Alexander Graf,
	Mike Rapoport, Konstantin Ryabitsev, Pasha Tatashin

On Fri, Mar 13, 2026 at 11:47:14AM +0200, Andy Shevchenko wrote:
> On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote:
> > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
> > to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
> > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.
> > 
> > The 32-bit behaviour is inconsistent with the function description, so it
> > needs to get fixed.
> > 
> > There are 9 individual users for the function in 6 different subsystems.
> > Some arches and drivers are 64-bit only:
> >  - arch/loongarch/kvm/intc/eiointc.c;
> >  - drivers/hv/mshv_vtl_main.c;
> >  - kernel/liveupdate/kexec_handover.c;
> > 
> > The others are:
> >  - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;
> >  - rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear;
> >  - lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental;
> > 
> > None of them explicitly tweak their code for a word length, or x == 0.
> > 
> > Requesting comments from the corresponding maintainers on how to proceed
> > with this.
> > 
> > The attached patch gets rid of 32-bit explicit support, so that both
> > 32- and 64-bit versions rely on __ffs().
> 
> > CC: "K. Y. Srinivasan" <kys@microsoft.com> (hyperv)
> > CC: Haiyang Zhang <haiyangz@microsoft.com> (hyperv)
> > CC: Jason Gunthorpe <jgg@ziepe.ca> (infiniband)
> > CC: Leon Romanovsky <leon@kernel.org> (infiniband)
> > CC: Mark Brown <broonie@kernel.org> (spi)
> > CC: Steve French <sfrench@samba.org> (smb)
> > CC: Alexander Graf <graf@amazon.com> (kexec)
> > CC: Mike Rapoport <rppt@kernel.org> (kexec)
> > CC: Pasha Tatashin <pasha.tatashin@soleen.com> (kexec)
> 
> Please, move the Cc: list to the...
> 
> > Signed-off-by: Yury Norov <ynorov@nvidia.com>
> > ---
> 
> ...comments block. It will have the same effect on emails, but drastically
> reduces unneeded noise in the commit message in the Git history.

In general case, I agree. In this particular case, I want CCs to be in the
main block, and eventually got replaced with the ACKs from the proper
maintainers.
 
> You may also read this subthread (patch 18) on how to handle it locally:
> https://lore.kernel.org/linux-iio/20260123113708.416727-19-bigeasy@linutronix.de/

+ Konstantin Ryabitsev <mricon@kernel.org>

(Thanks for b4!)

Interesting thread.

So, my workflow is:

 1. git format-patch --cover-letter
 2. # Edit cover letter, add To and CC section
 3. git send-email 000* --to-cover --cc-cover
 4. b4 am 
 5. # Address nits/typos in the mbox
 6. git am 
 7. # Address substantial comments in git
 8. git format-patch -v2 --cover-letter
 9. # Edit cover letter again to restore body, To and CC sections
10. git send-email v2-000* --to-cover --cc-cover

So, yes I loose recipients on every iteration, together with the whole
cover letter. But to me it's not a big deal because I can just pull
them from my mailbox.

In the better world, I'd like to have:
git send-email -v2 000* --to-the-same-people-as-in-v1

In the perfect world, I'd prefer to keep the cover letter under the
git control, once it created, together with the recipients, once they
are added, and be able to edit them just like regular commits.

There's a 'git am -k', which is seemingly related to the matter, and
it keeps the [PATCH] prefix. But it's not what can preserve recipients
for me.

I'll try b4 prep and trailers as suggested.

Thanks,
Yury

> >  include/linux/count_zeros.h | 9 +++------
> 
> ...
> 
> > +#define COUNT_TRAILING_ZEROS_0 (-1)
> 
> Shouldn't we also saturate this to BITS_PER_LONG?
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  2026-03-13 17:48   ` Yury Norov
@ 2026-03-13 18:41     ` Konstantin Ryabitsev
  0 siblings, 0 replies; 10+ messages in thread
From: Konstantin Ryabitsev @ 2026-03-13 18:41 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Yury Norov, Rasmus Villemoes, Eric Biggers,
	Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec,
	linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan,
	Haiyang Zhang, Jason Gunthorpe, Leon Romanovsky, Mark Brown,
	Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin

On Fri, Mar 13, 2026 at 01:48:23PM -0400, Yury Norov wrote:
> (Thanks for b4!)

\o/

> Interesting thread.
> 
> So, my workflow is:
> 
>  1. git format-patch --cover-letter
>  2. # Edit cover letter, add To and CC section
>  3. git send-email 000* --to-cover --cc-cover
>  4. b4 am 
>  5. # Address nits/typos in the mbox
>  6. git am 
>  7. # Address substantial comments in git
>  8. git format-patch -v2 --cover-letter
>  9. # Edit cover letter again to restore body, To and CC sections
> 10. git send-email v2-000* --to-cover --cc-cover

This is doing it the classic way.

> So, yes I loose recipients on every iteration, together with the whole
> cover letter. But to me it's not a big deal because I can just pull
> them from my mailbox.
> 
> In the better world, I'd like to have:
> git send-email -v2 000* --to-the-same-people-as-in-v1
> 
> In the perfect world, I'd prefer to keep the cover letter under the
> git control, once it created, together with the recipients, once they
> are added, and be able to edit them just like regular commits.
> 
> There's a 'git am -k', which is seemingly related to the matter, and
> it keeps the [PATCH] prefix. But it's not what can preserve recipients
> for me.
> 
> I'll try b4 prep and trailers as suggested.

Yes, it was created to simplify this process significantly. It's still mostly
git and email, but at least you won't have to do quite so many manual steps.

Regards,
-- 
KR

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  2026-03-12 23:08 [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() Yury Norov
  2026-03-12 23:54 ` Enzo Matsumiya
  2026-03-13  9:47 ` Andy Shevchenko
@ 2026-03-13 17:18 ` Jason Gunthorpe
  2026-03-13 18:14   ` Yury Norov
  2 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2026-03-13 17:18 UTC (permalink / raw)
  To: Yury Norov
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers,
	Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec,
	linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan,
	Haiyang Zhang, Leon Romanovsky, Mark Brown, Steve French,
	Alexander Graf, Mike Rapoport, Pasha Tatashin

On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote:
> Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
> to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
> ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.
> 
> The 32-bit behaviour is inconsistent with the function description, so it
> needs to get fixed.
> 
> There are 9 individual users for the function in 6 different subsystems.
> Some arches and drivers are 64-bit only:
>  - arch/loongarch/kvm/intc/eiointc.c;
>  - drivers/hv/mshv_vtl_main.c;
>  - kernel/liveupdate/kexec_handover.c;
> 
> The others are:
>  - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;

So long as 32 bit works the same as 64 bit it is correct for ib

Jason

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  2026-03-13 17:18 ` Jason Gunthorpe
@ 2026-03-13 18:14   ` Yury Norov
  2026-03-17  9:14     ` Leon Romanovsky
  0 siblings, 1 reply; 10+ messages in thread
From: Yury Norov @ 2026-03-13 18:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers,
	Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec,
	linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan,
	Haiyang Zhang, Leon Romanovsky, Mark Brown, Steve French,
	Alexander Graf, Mike Rapoport, Pasha Tatashin

On Fri, Mar 13, 2026 at 02:18:55PM -0300, Jason Gunthorpe wrote:
> On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote:
> > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
> > to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
> > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.
> > 
> > The 32-bit behaviour is inconsistent with the function description, so it
> > needs to get fixed.
> > 
> > There are 9 individual users for the function in 6 different subsystems.
> > Some arches and drivers are 64-bit only:
> >  - arch/loongarch/kvm/intc/eiointc.c;
> >  - drivers/hv/mshv_vtl_main.c;
> >  - kernel/liveupdate/kexec_handover.c;
> > 
> > The others are:
> >  - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;
> 
> So long as 32 bit works the same as 64 bit it is correct for ib

This is what the patch does, except that it doesn't account for the
word length. In you case, 'mask' is dma_addr_t, which is u32 or u64
depending ARCH_DMA_ADDR_T_64BIT.

This config is:

        config ARCH_DMA_ADDR_T_64BIT
                def_bool 64BIT || PHYS_ADDR_T_64BIT

And PHYS_ADDR_T_64BIT is simply def_bool 64BIT. So, at least now
dma_addr_t simply follows unsigned long, and thus, the patch is
correct. But IDK what's the history behind this configurations.

Anyways, the patch aligns 32-bit count_trailing_zeros() with the
64-bit one. If you OK with that, as you said, can you please send
an explicit ack?

Thanks,
Yury

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  2026-03-13 18:14   ` Yury Norov
@ 2026-03-17  9:14     ` Leon Romanovsky
  2026-03-18 16:31       ` Yury Norov
  0 siblings, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2026-03-17  9:14 UTC (permalink / raw)
  To: Yury Norov
  Cc: Jason Gunthorpe, Yury Norov, Andy Shevchenko, Rasmus Villemoes,
	Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel,
	kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan,
	Haiyang Zhang, Mark Brown, Steve French, Alexander Graf,
	Mike Rapoport, Pasha Tatashin

On Fri, Mar 13, 2026 at 02:14:49PM -0400, Yury Norov wrote:
> On Fri, Mar 13, 2026 at 02:18:55PM -0300, Jason Gunthorpe wrote:
> > On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote:
> > > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
> > > to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
> > > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.
> > > 
> > > The 32-bit behaviour is inconsistent with the function description, so it
> > > needs to get fixed.
> > > 
> > > There are 9 individual users for the function in 6 different subsystems.
> > > Some arches and drivers are 64-bit only:
> > >  - arch/loongarch/kvm/intc/eiointc.c;
> > >  - drivers/hv/mshv_vtl_main.c;
> > >  - kernel/liveupdate/kexec_handover.c;
> > > 
> > > The others are:
> > >  - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;
> > 
> > So long as 32 bit works the same as 64 bit it is correct for ib
> 
> This is what the patch does, except that it doesn't account for the
> word length. In you case, 'mask' is dma_addr_t, which is u32 or u64
> depending ARCH_DMA_ADDR_T_64BIT.
> 
> This config is:
> 
>         config ARCH_DMA_ADDR_T_64BIT
>                 def_bool 64BIT || PHYS_ADDR_T_64BIT
> 
> And PHYS_ADDR_T_64BIT is simply def_bool 64BIT. So, at least now
> dma_addr_t simply follows unsigned long, and thus, the patch is
> correct. But IDK what's the history behind this configurations.
> 
> Anyways, the patch aligns 32-bit count_trailing_zeros() with the
> 64-bit one. If you OK with that, as you said, can you please send
> an explicit ack?

I can do that, 32 bits architectures are rarely used in the IB world.

Thanks,
Acked-by: Leon Romanovsky <leon@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
  2026-03-17  9:14     ` Leon Romanovsky
@ 2026-03-18 16:31       ` Yury Norov
  0 siblings, 0 replies; 10+ messages in thread
From: Yury Norov @ 2026-03-18 16:31 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jason Gunthorpe, Yury Norov, Andy Shevchenko, Rasmus Villemoes,
	Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel,
	kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan,
	Haiyang Zhang, Mark Brown, Steve French, Alexander Graf,
	Mike Rapoport, Pasha Tatashin

On Tue, Mar 17, 2026 at 11:14:11AM +0200, Leon Romanovsky wrote:
> On Fri, Mar 13, 2026 at 02:14:49PM -0400, Yury Norov wrote:
> > On Fri, Mar 13, 2026 at 02:18:55PM -0300, Jason Gunthorpe wrote:
> > > On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote:
> > > > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
> > > > to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
> > > > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.
> > > > 
> > > > The 32-bit behaviour is inconsistent with the function description, so it
> > > > needs to get fixed.
> > > > 
> > > > There are 9 individual users for the function in 6 different subsystems.
> > > > Some arches and drivers are 64-bit only:
> > > >  - arch/loongarch/kvm/intc/eiointc.c;
> > > >  - drivers/hv/mshv_vtl_main.c;
> > > >  - kernel/liveupdate/kexec_handover.c;
> > > > 
> > > > The others are:
> > > >  - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;
> > > 
> > > So long as 32 bit works the same as 64 bit it is correct for ib
> > 
> > This is what the patch does, except that it doesn't account for the
> > word length. In you case, 'mask' is dma_addr_t, which is u32 or u64
> > depending ARCH_DMA_ADDR_T_64BIT.
> > 
> > This config is:
> > 
> >         config ARCH_DMA_ADDR_T_64BIT
> >                 def_bool 64BIT || PHYS_ADDR_T_64BIT
> > 
> > And PHYS_ADDR_T_64BIT is simply def_bool 64BIT. So, at least now
> > dma_addr_t simply follows unsigned long, and thus, the patch is
> > correct. But IDK what's the history behind this configurations.
> > 
> > Anyways, the patch aligns 32-bit count_trailing_zeros() with the
> > 64-bit one. If you OK with that, as you said, can you please send
> > an explicit ack?
> 
> I can do that, 32 bits architectures are rarely used in the IB world.
> 
> Thanks,
> Acked-by: Leon Romanovsky <leon@kernel.org>

Thanks, Leon. Seemingly no headwinds for the patch. Taking in in -next
for testing.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-03-18 16:32 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-12 23:08 [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() Yury Norov
2026-03-12 23:54 ` Enzo Matsumiya
2026-03-13 16:31   ` Yury Norov
2026-03-13  9:47 ` Andy Shevchenko
2026-03-13 17:48   ` Yury Norov
2026-03-13 18:41     ` Konstantin Ryabitsev
2026-03-13 17:18 ` Jason Gunthorpe
2026-03-13 18:14   ` Yury Norov
2026-03-17  9:14     ` Leon Romanovsky
2026-03-18 16:31       ` Yury Norov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox