* [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
@ 2026-03-12 23:08 Yury Norov
2026-03-12 23:54 ` Enzo Matsumiya
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Yury Norov @ 2026-03-12 23:08 UTC (permalink / raw)
To: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers,
Jason A. Donenfeld, Ard Biesheuvel
Cc: Yury Norov, linux-kernel, kexec, linux-cifs, linux-spi,
linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Jason Gunthorpe,
Leon Romanovsky, Mark Brown, Steve French, Alexander Graf,
Mike Rapoport, Pasha Tatashin
Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired
to ffs(), while in 64-bit case to __ffs(). The difference is substantial:
ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined.
The 32-bit behaviour is inconsistent with the function description, so it
needs to get fixed.
There are 9 individual users for the function in 6 different subsystems.
Some arches and drivers are 64-bit only:
- arch/loongarch/kvm/intc/eiointc.c;
- drivers/hv/mshv_vtl_main.c;
- kernel/liveupdate/kexec_handover.c;
The others are:
- ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct;
- rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear;
- lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental;
None of them explicitly tweak their code for a word length, or x == 0.
Requesting comments from the corresponding maintainers on how to proceed
with this.
The attached patch gets rid of 32-bit explicit support, so that both
32- and 64-bit versions rely on __ffs().
CC: "K. Y. Srinivasan" <kys@microsoft.com> (hyperv)
CC: Haiyang Zhang <haiyangz@microsoft.com> (hyperv)
CC: Jason Gunthorpe <jgg@ziepe.ca> (infiniband)
CC: Leon Romanovsky <leon@kernel.org> (infiniband)
CC: Mark Brown <broonie@kernel.org> (spi)
CC: Steve French <sfrench@samba.org> (smb)
CC: Alexander Graf <graf@amazon.com> (kexec)
CC: Mike Rapoport <rppt@kernel.org> (kexec)
CC: Pasha Tatashin <pasha.tatashin@soleen.com> (kexec)
Signed-off-by: Yury Norov <ynorov@nvidia.com>
---
include/linux/count_zeros.h | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/include/linux/count_zeros.h b/include/linux/count_zeros.h
index 4e5680327ece..5034a30b5c7c 100644
--- a/include/linux/count_zeros.h
+++ b/include/linux/count_zeros.h
@@ -10,6 +10,8 @@
#include <asm/bitops.h>
+#define COUNT_TRAILING_ZEROS_0 (-1)
+
/**
* count_leading_zeros - Count the number of zeros from the MSB back
* @x: The value
@@ -40,12 +42,7 @@ static inline int count_leading_zeros(unsigned long x)
*/
static inline int count_trailing_zeros(unsigned long x)
{
-#define COUNT_TRAILING_ZEROS_0 (-1)
-
- if (sizeof(x) == 4)
- return ffs(x);
- else
- return (x != 0) ? __ffs(x) : COUNT_TRAILING_ZEROS_0;
+ return (x != 0) ? __ffs(x) : COUNT_TRAILING_ZEROS_0;
}
#endif /* _LINUX_BITOPS_COUNT_ZEROS_H_ */
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() 2026-03-12 23:08 [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() Yury Norov @ 2026-03-12 23:54 ` Enzo Matsumiya 2026-03-13 16:31 ` Yury Norov 2026-03-13 9:47 ` Andy Shevchenko 2026-03-13 17:18 ` Jason Gunthorpe 2 siblings, 1 reply; 10+ messages in thread From: Enzo Matsumiya @ 2026-03-12 23:54 UTC (permalink / raw) To: Yury Norov Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Jason Gunthorpe, Leon Romanovsky, Mark Brown, Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin Hi Yury, On 03/12, Yury Norov wrote: >Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired >to ffs(), while in 64-bit case to __ffs(). The difference is substantial: >ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined. > >The 32-bit behaviour is inconsistent with the function description, so it >needs to get fixed. > >There are 9 individual users for the function in 6 different subsystems. >Some arches and drivers are 64-bit only: > - arch/loongarch/kvm/intc/eiointc.c; > - drivers/hv/mshv_vtl_main.c; > - kernel/liveupdate/kexec_handover.c; > >The others are: > - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct; > - rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear; > - lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental; > >None of them explicitly tweak their code for a word length, or x == 0. Context for lz77_match_len() case: const u64 diff = lz77_read64(cur) ^ lz77_read64(wnd); if (!diff) { ... } cur += count_trailing_zeros(diff) >> 3; So x == 0 is checked, however it does assume that sizeof(unsigned long) == sizeof(u64). I'll have to fix it for when that's not the case (even with your patch in, as __ffs() casts x to unsigned long down the line). Thanks for the heads up. Cheers, Enzo >Requesting comments from the corresponding maintainers on how to proceed >with this. > >The attached patch gets rid of 32-bit explicit support, so that both >32- and 64-bit versions rely on __ffs(). > >CC: "K. Y. Srinivasan" <kys@microsoft.com> (hyperv) >CC: Haiyang Zhang <haiyangz@microsoft.com> (hyperv) >CC: Jason Gunthorpe <jgg@ziepe.ca> (infiniband) >CC: Leon Romanovsky <leon@kernel.org> (infiniband) >CC: Mark Brown <broonie@kernel.org> (spi) >CC: Steve French <sfrench@samba.org> (smb) >CC: Alexander Graf <graf@amazon.com> (kexec) >CC: Mike Rapoport <rppt@kernel.org> (kexec) >CC: Pasha Tatashin <pasha.tatashin@soleen.com> (kexec) >Signed-off-by: Yury Norov <ynorov@nvidia.com> >--- > include/linux/count_zeros.h | 9 +++------ > 1 file changed, 3 insertions(+), 6 deletions(-) > >diff --git a/include/linux/count_zeros.h b/include/linux/count_zeros.h >index 4e5680327ece..5034a30b5c7c 100644 >--- a/include/linux/count_zeros.h >+++ b/include/linux/count_zeros.h >@@ -10,6 +10,8 @@ > > #include <asm/bitops.h> > >+#define COUNT_TRAILING_ZEROS_0 (-1) >+ > /** > * count_leading_zeros - Count the number of zeros from the MSB back > * @x: The value >@@ -40,12 +42,7 @@ static inline int count_leading_zeros(unsigned long x) > */ > static inline int count_trailing_zeros(unsigned long x) > { >-#define COUNT_TRAILING_ZEROS_0 (-1) >- >- if (sizeof(x) == 4) >- return ffs(x); >- else >- return (x != 0) ? __ffs(x) : COUNT_TRAILING_ZEROS_0; >+ return (x != 0) ? __ffs(x) : COUNT_TRAILING_ZEROS_0; > } > > #endif /* _LINUX_BITOPS_COUNT_ZEROS_H_ */ >-- >2.43.0 > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() 2026-03-12 23:54 ` Enzo Matsumiya @ 2026-03-13 16:31 ` Yury Norov 0 siblings, 0 replies; 10+ messages in thread From: Yury Norov @ 2026-03-13 16:31 UTC (permalink / raw) To: Enzo Matsumiya Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Jason Gunthorpe, Leon Romanovsky, Mark Brown, Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin On Thu, Mar 12, 2026 at 08:54:29PM -0300, Enzo Matsumiya wrote: > Hi Yury, > > On 03/12, Yury Norov wrote: > > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired > > to ffs(), while in 64-bit case to __ffs(). The difference is substantial: > > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined. > > > > The 32-bit behaviour is inconsistent with the function description, so it > > needs to get fixed. > > > > There are 9 individual users for the function in 6 different subsystems. > > Some arches and drivers are 64-bit only: > > - arch/loongarch/kvm/intc/eiointc.c; > > - drivers/hv/mshv_vtl_main.c; > > - kernel/liveupdate/kexec_handover.c; > > > > The others are: > > - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct; > > - rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear; > > - lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental; > > > > None of them explicitly tweak their code for a word length, or x == 0. > > Context for lz77_match_len() case: > > const u64 diff = lz77_read64(cur) ^ lz77_read64(wnd); > > if (!diff) { > ... > } > > cur += count_trailing_zeros(diff) >> 3; > > So x == 0 is checked, however it does assume that > sizeof(unsigned long) == sizeof(u64). I'll have to fix it for when > that's not the case (even with your patch in, as __ffs() casts x to > unsigned long down the line). Thanks for the heads up. Yes, in your case you need __ffs64() (which doesn't exist). Or simply leverage bitmaps API: DECLARE_BITMAP(bitmap, 64); ... bitmap_from_u64(bitmap, diff); cur += find_first_bit(bitmap, 64) >> 3; > > Cheers, > > Enzo So, if you're not objecting to wire 32-bit count_trailing_zeros() to __ffs(), can you please send your ack? Thanks, Yury ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() 2026-03-12 23:08 [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() Yury Norov 2026-03-12 23:54 ` Enzo Matsumiya @ 2026-03-13 9:47 ` Andy Shevchenko 2026-03-13 17:48 ` Yury Norov 2026-03-13 17:18 ` Jason Gunthorpe 2 siblings, 1 reply; 10+ messages in thread From: Andy Shevchenko @ 2026-03-13 9:47 UTC (permalink / raw) To: Yury Norov Cc: Yury Norov, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Jason Gunthorpe, Leon Romanovsky, Mark Brown, Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote: > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired > to ffs(), while in 64-bit case to __ffs(). The difference is substantial: > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined. > > The 32-bit behaviour is inconsistent with the function description, so it > needs to get fixed. > > There are 9 individual users for the function in 6 different subsystems. > Some arches and drivers are 64-bit only: > - arch/loongarch/kvm/intc/eiointc.c; > - drivers/hv/mshv_vtl_main.c; > - kernel/liveupdate/kexec_handover.c; > > The others are: > - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct; > - rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear; > - lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental; > > None of them explicitly tweak their code for a word length, or x == 0. > > Requesting comments from the corresponding maintainers on how to proceed > with this. > > The attached patch gets rid of 32-bit explicit support, so that both > 32- and 64-bit versions rely on __ffs(). > CC: "K. Y. Srinivasan" <kys@microsoft.com> (hyperv) > CC: Haiyang Zhang <haiyangz@microsoft.com> (hyperv) > CC: Jason Gunthorpe <jgg@ziepe.ca> (infiniband) > CC: Leon Romanovsky <leon@kernel.org> (infiniband) > CC: Mark Brown <broonie@kernel.org> (spi) > CC: Steve French <sfrench@samba.org> (smb) > CC: Alexander Graf <graf@amazon.com> (kexec) > CC: Mike Rapoport <rppt@kernel.org> (kexec) > CC: Pasha Tatashin <pasha.tatashin@soleen.com> (kexec) Please, move the Cc: list to the... > Signed-off-by: Yury Norov <ynorov@nvidia.com> > --- ...comments block. It will have the same effect on emails, but drastically reduces unneeded noise in the commit message in the Git history. You may also read this subthread (patch 18) on how to handle it locally: https://lore.kernel.org/linux-iio/20260123113708.416727-19-bigeasy@linutronix.de/ > include/linux/count_zeros.h | 9 +++------ ... > +#define COUNT_TRAILING_ZEROS_0 (-1) Shouldn't we also saturate this to BITS_PER_LONG? -- With Best Regards, Andy Shevchenko ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() 2026-03-13 9:47 ` Andy Shevchenko @ 2026-03-13 17:48 ` Yury Norov 2026-03-13 18:41 ` Konstantin Ryabitsev 0 siblings, 1 reply; 10+ messages in thread From: Yury Norov @ 2026-03-13 17:48 UTC (permalink / raw) To: Andy Shevchenko Cc: Yury Norov, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Jason Gunthorpe, Leon Romanovsky, Mark Brown, Steve French, Alexander Graf, Mike Rapoport, Konstantin Ryabitsev, Pasha Tatashin On Fri, Mar 13, 2026 at 11:47:14AM +0200, Andy Shevchenko wrote: > On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote: > > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired > > to ffs(), while in 64-bit case to __ffs(). The difference is substantial: > > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined. > > > > The 32-bit behaviour is inconsistent with the function description, so it > > needs to get fixed. > > > > There are 9 individual users for the function in 6 different subsystems. > > Some arches and drivers are 64-bit only: > > - arch/loongarch/kvm/intc/eiointc.c; > > - drivers/hv/mshv_vtl_main.c; > > - kernel/liveupdate/kexec_handover.c; > > > > The others are: > > - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct; > > - rzv2m_csi_reg_write_bit(): ARCH_RENESAS only, unclear; > > - lz77_match_len(): CIFS_COMPRESSION only, unclear, experimental; > > > > None of them explicitly tweak their code for a word length, or x == 0. > > > > Requesting comments from the corresponding maintainers on how to proceed > > with this. > > > > The attached patch gets rid of 32-bit explicit support, so that both > > 32- and 64-bit versions rely on __ffs(). > > > CC: "K. Y. Srinivasan" <kys@microsoft.com> (hyperv) > > CC: Haiyang Zhang <haiyangz@microsoft.com> (hyperv) > > CC: Jason Gunthorpe <jgg@ziepe.ca> (infiniband) > > CC: Leon Romanovsky <leon@kernel.org> (infiniband) > > CC: Mark Brown <broonie@kernel.org> (spi) > > CC: Steve French <sfrench@samba.org> (smb) > > CC: Alexander Graf <graf@amazon.com> (kexec) > > CC: Mike Rapoport <rppt@kernel.org> (kexec) > > CC: Pasha Tatashin <pasha.tatashin@soleen.com> (kexec) > > Please, move the Cc: list to the... > > > Signed-off-by: Yury Norov <ynorov@nvidia.com> > > --- > > ...comments block. It will have the same effect on emails, but drastically > reduces unneeded noise in the commit message in the Git history. In general case, I agree. In this particular case, I want CCs to be in the main block, and eventually got replaced with the ACKs from the proper maintainers. > You may also read this subthread (patch 18) on how to handle it locally: > https://lore.kernel.org/linux-iio/20260123113708.416727-19-bigeasy@linutronix.de/ + Konstantin Ryabitsev <mricon@kernel.org> (Thanks for b4!) Interesting thread. So, my workflow is: 1. git format-patch --cover-letter 2. # Edit cover letter, add To and CC section 3. git send-email 000* --to-cover --cc-cover 4. b4 am 5. # Address nits/typos in the mbox 6. git am 7. # Address substantial comments in git 8. git format-patch -v2 --cover-letter 9. # Edit cover letter again to restore body, To and CC sections 10. git send-email v2-000* --to-cover --cc-cover So, yes I loose recipients on every iteration, together with the whole cover letter. But to me it's not a big deal because I can just pull them from my mailbox. In the better world, I'd like to have: git send-email -v2 000* --to-the-same-people-as-in-v1 In the perfect world, I'd prefer to keep the cover letter under the git control, once it created, together with the recipients, once they are added, and be able to edit them just like regular commits. There's a 'git am -k', which is seemingly related to the matter, and it keeps the [PATCH] prefix. But it's not what can preserve recipients for me. I'll try b4 prep and trailers as suggested. Thanks, Yury > > include/linux/count_zeros.h | 9 +++------ > > ... > > > +#define COUNT_TRAILING_ZEROS_0 (-1) > > Shouldn't we also saturate this to BITS_PER_LONG? > > -- > With Best Regards, > Andy Shevchenko > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() 2026-03-13 17:48 ` Yury Norov @ 2026-03-13 18:41 ` Konstantin Ryabitsev 0 siblings, 0 replies; 10+ messages in thread From: Konstantin Ryabitsev @ 2026-03-13 18:41 UTC (permalink / raw) To: Yury Norov Cc: Andy Shevchenko, Yury Norov, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Jason Gunthorpe, Leon Romanovsky, Mark Brown, Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin On Fri, Mar 13, 2026 at 01:48:23PM -0400, Yury Norov wrote: > (Thanks for b4!) \o/ > Interesting thread. > > So, my workflow is: > > 1. git format-patch --cover-letter > 2. # Edit cover letter, add To and CC section > 3. git send-email 000* --to-cover --cc-cover > 4. b4 am > 5. # Address nits/typos in the mbox > 6. git am > 7. # Address substantial comments in git > 8. git format-patch -v2 --cover-letter > 9. # Edit cover letter again to restore body, To and CC sections > 10. git send-email v2-000* --to-cover --cc-cover This is doing it the classic way. > So, yes I loose recipients on every iteration, together with the whole > cover letter. But to me it's not a big deal because I can just pull > them from my mailbox. > > In the better world, I'd like to have: > git send-email -v2 000* --to-the-same-people-as-in-v1 > > In the perfect world, I'd prefer to keep the cover letter under the > git control, once it created, together with the recipients, once they > are added, and be able to edit them just like regular commits. > > There's a 'git am -k', which is seemingly related to the matter, and > it keeps the [PATCH] prefix. But it's not what can preserve recipients > for me. > > I'll try b4 prep and trailers as suggested. Yes, it was created to simplify this process significantly. It's still mostly git and email, but at least you won't have to do quite so many manual steps. Regards, -- KR ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() 2026-03-12 23:08 [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() Yury Norov 2026-03-12 23:54 ` Enzo Matsumiya 2026-03-13 9:47 ` Andy Shevchenko @ 2026-03-13 17:18 ` Jason Gunthorpe 2026-03-13 18:14 ` Yury Norov 2 siblings, 1 reply; 10+ messages in thread From: Jason Gunthorpe @ 2026-03-13 17:18 UTC (permalink / raw) To: Yury Norov Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Leon Romanovsky, Mark Brown, Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote: > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired > to ffs(), while in 64-bit case to __ffs(). The difference is substantial: > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined. > > The 32-bit behaviour is inconsistent with the function description, so it > needs to get fixed. > > There are 9 individual users for the function in 6 different subsystems. > Some arches and drivers are 64-bit only: > - arch/loongarch/kvm/intc/eiointc.c; > - drivers/hv/mshv_vtl_main.c; > - kernel/liveupdate/kexec_handover.c; > > The others are: > - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct; So long as 32 bit works the same as 64 bit it is correct for ib Jason ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() 2026-03-13 17:18 ` Jason Gunthorpe @ 2026-03-13 18:14 ` Yury Norov 2026-03-17 9:14 ` Leon Romanovsky 0 siblings, 1 reply; 10+ messages in thread From: Yury Norov @ 2026-03-13 18:14 UTC (permalink / raw) To: Jason Gunthorpe Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Leon Romanovsky, Mark Brown, Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin On Fri, Mar 13, 2026 at 02:18:55PM -0300, Jason Gunthorpe wrote: > On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote: > > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired > > to ffs(), while in 64-bit case to __ffs(). The difference is substantial: > > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined. > > > > The 32-bit behaviour is inconsistent with the function description, so it > > needs to get fixed. > > > > There are 9 individual users for the function in 6 different subsystems. > > Some arches and drivers are 64-bit only: > > - arch/loongarch/kvm/intc/eiointc.c; > > - drivers/hv/mshv_vtl_main.c; > > - kernel/liveupdate/kexec_handover.c; > > > > The others are: > > - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct; > > So long as 32 bit works the same as 64 bit it is correct for ib This is what the patch does, except that it doesn't account for the word length. In you case, 'mask' is dma_addr_t, which is u32 or u64 depending ARCH_DMA_ADDR_T_64BIT. This config is: config ARCH_DMA_ADDR_T_64BIT def_bool 64BIT || PHYS_ADDR_T_64BIT And PHYS_ADDR_T_64BIT is simply def_bool 64BIT. So, at least now dma_addr_t simply follows unsigned long, and thus, the patch is correct. But IDK what's the history behind this configurations. Anyways, the patch aligns 32-bit count_trailing_zeros() with the 64-bit one. If you OK with that, as you said, can you please send an explicit ack? Thanks, Yury ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() 2026-03-13 18:14 ` Yury Norov @ 2026-03-17 9:14 ` Leon Romanovsky 2026-03-18 16:31 ` Yury Norov 0 siblings, 1 reply; 10+ messages in thread From: Leon Romanovsky @ 2026-03-17 9:14 UTC (permalink / raw) To: Yury Norov Cc: Jason Gunthorpe, Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Mark Brown, Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin On Fri, Mar 13, 2026 at 02:14:49PM -0400, Yury Norov wrote: > On Fri, Mar 13, 2026 at 02:18:55PM -0300, Jason Gunthorpe wrote: > > On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote: > > > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired > > > to ffs(), while in 64-bit case to __ffs(). The difference is substantial: > > > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined. > > > > > > The 32-bit behaviour is inconsistent with the function description, so it > > > needs to get fixed. > > > > > > There are 9 individual users for the function in 6 different subsystems. > > > Some arches and drivers are 64-bit only: > > > - arch/loongarch/kvm/intc/eiointc.c; > > > - drivers/hv/mshv_vtl_main.c; > > > - kernel/liveupdate/kexec_handover.c; > > > > > > The others are: > > > - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct; > > > > So long as 32 bit works the same as 64 bit it is correct for ib > > This is what the patch does, except that it doesn't account for the > word length. In you case, 'mask' is dma_addr_t, which is u32 or u64 > depending ARCH_DMA_ADDR_T_64BIT. > > This config is: > > config ARCH_DMA_ADDR_T_64BIT > def_bool 64BIT || PHYS_ADDR_T_64BIT > > And PHYS_ADDR_T_64BIT is simply def_bool 64BIT. So, at least now > dma_addr_t simply follows unsigned long, and thus, the patch is > correct. But IDK what's the history behind this configurations. > > Anyways, the patch aligns 32-bit count_trailing_zeros() with the > 64-bit one. If you OK with that, as you said, can you please send > an explicit ack? I can do that, 32 bits architectures are rarely used in the IB world. Thanks, Acked-by: Leon Romanovsky <leon@kernel.org> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() 2026-03-17 9:14 ` Leon Romanovsky @ 2026-03-18 16:31 ` Yury Norov 0 siblings, 0 replies; 10+ messages in thread From: Yury Norov @ 2026-03-18 16:31 UTC (permalink / raw) To: Leon Romanovsky Cc: Jason Gunthorpe, Yury Norov, Andy Shevchenko, Rasmus Villemoes, Eric Biggers, Jason A. Donenfeld, Ard Biesheuvel, linux-kernel, kexec, linux-cifs, linux-spi, linux-hyperv, K. Y. Srinivasan, Haiyang Zhang, Mark Brown, Steve French, Alexander Graf, Mike Rapoport, Pasha Tatashin On Tue, Mar 17, 2026 at 11:14:11AM +0200, Leon Romanovsky wrote: > On Fri, Mar 13, 2026 at 02:14:49PM -0400, Yury Norov wrote: > > On Fri, Mar 13, 2026 at 02:18:55PM -0300, Jason Gunthorpe wrote: > > > On Thu, Mar 12, 2026 at 07:08:16PM -0400, Yury Norov wrote: > > > > Based on 'sizeof(x) == 4' condition, in 32-bit case the function is wired > > > > to ffs(), while in 64-bit case to __ffs(). The difference is substantial: > > > > ffs(x) == __ffs(x) + 1. Also, ffs(0) == 0, while __ffs(0) is undefined. > > > > > > > > The 32-bit behaviour is inconsistent with the function description, so it > > > > needs to get fixed. > > > > > > > > There are 9 individual users for the function in 6 different subsystems. > > > > Some arches and drivers are 64-bit only: > > > > - arch/loongarch/kvm/intc/eiointc.c; > > > > - drivers/hv/mshv_vtl_main.c; > > > > - kernel/liveupdate/kexec_handover.c; > > > > > > > > The others are: > > > > - ib_umem_find_best_pgsz(): as per comment, __ffs() should be correct; > > > > > > So long as 32 bit works the same as 64 bit it is correct for ib > > > > This is what the patch does, except that it doesn't account for the > > word length. In you case, 'mask' is dma_addr_t, which is u32 or u64 > > depending ARCH_DMA_ADDR_T_64BIT. > > > > This config is: > > > > config ARCH_DMA_ADDR_T_64BIT > > def_bool 64BIT || PHYS_ADDR_T_64BIT > > > > And PHYS_ADDR_T_64BIT is simply def_bool 64BIT. So, at least now > > dma_addr_t simply follows unsigned long, and thus, the patch is > > correct. But IDK what's the history behind this configurations. > > > > Anyways, the patch aligns 32-bit count_trailing_zeros() with the > > 64-bit one. If you OK with that, as you said, can you please send > > an explicit ack? > > I can do that, 32 bits architectures are rarely used in the IB world. > > Thanks, > Acked-by: Leon Romanovsky <leon@kernel.org> Thanks, Leon. Seemingly no headwinds for the patch. Taking in in -next for testing. Thanks, Yury ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-03-18 16:32 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-03-12 23:08 [PATCH] lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros() Yury Norov 2026-03-12 23:54 ` Enzo Matsumiya 2026-03-13 16:31 ` Yury Norov 2026-03-13 9:47 ` Andy Shevchenko 2026-03-13 17:48 ` Yury Norov 2026-03-13 18:41 ` Konstantin Ryabitsev 2026-03-13 17:18 ` Jason Gunthorpe 2026-03-13 18:14 ` Yury Norov 2026-03-17 9:14 ` Leon Romanovsky 2026-03-18 16:31 ` Yury Norov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox