rust-for-linux.vger.kernel.org archive mirror
* [PATCH v2] drm/panic: Add a u64 divide by 10 for arm32
@ 2025-06-27 12:38 Jocelyn Falempe
  2025-08-01  8:52 ` Jocelyn Falempe
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Jocelyn Falempe @ 2025-06-27 12:38 UTC (permalink / raw)
  To: Andrei Lalaev, Miguel Ojeda, Christian Schrefl, Arnd Bergmann,
	Russell King, Paolo Bonzini, rust-for-linux, Linux ARM,
	Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Simona Vetter, dri-devel
  Cc: Jocelyn Falempe

On 32-bit ARM, the compiler does not optimize a u64 division by a
constant into a multiplication by the inverse [1].
So do the multiplication by the inverse explicitly for this architecture.

Link: https://github.com/llvm/llvm-project/issues/37280 [1]
Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com>
Closes: https://lore.kernel.org/dri-devel/c0a2771c-f3f5-4d4c-aa82-d673b3c5cb46@gmail.com/
Fixes: 675008f196ca ("drm/panic: Use a decimal fifo to avoid u64 by u64 divide")
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
---
 drivers/gpu/drm/drm_panic_qr.rs | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_panic_qr.rs b/drivers/gpu/drm/drm_panic_qr.rs
index dd55b1cb764d..774a17de4f2f 100644
--- a/drivers/gpu/drm/drm_panic_qr.rs
+++ b/drivers/gpu/drm/drm_panic_qr.rs
@@ -381,6 +381,26 @@ struct DecFifo {
     len: usize,
 }
 
+// On the arm32 architecture, dividing a `u64` by a constant generates a call
+// to `__aeabi_uldivmod`, which is not present in the kernel.
+// So use the multiply-by-inverse method for this architecture.
+fn div10(val: u64) -> u64 {
+    if cfg!(target_arch = "arm") {
+        let val_h = val >> 32;
+        let val_l = val & 0xFFFFFFFF;
+        let b_h: u64 = 0x66666666;
+        let b_l: u64 = 0x66666667;
+
+        let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
+        let tmp2 = val_l * b_h + (tmp1 & 0xffffffff);
+        let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);
+
+        tmp3 >> 2
+    } else {
+        val / 10
+    }
+}
+
 impl DecFifo {
     fn push(&mut self, data: u64, len: usize) {
         let mut chunk = data;
@@ -389,7 +409,7 @@ fn push(&mut self, data: u64, len: usize) {
         }
         for i in 0..len {
             self.decimals[i] = (chunk % 10) as u8;
-            chunk /= 10;
+            chunk = div10(chunk);
         }
         self.len += len;
     }

base-commit: 3529cb5ab16b4f1f8bbc31dc39a1076a94bd1e38
-- 
2.49.0
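
The arithmetic in the arm32 branch can be checked outside the kernel. The following is a minimal user-space sketch, not part of the patch: `div10_mul_inverse` is a hypothetical name for a copy of that branch, compared against a plain `/ 10`. The two halves `b_h` and `b_l` form 0x6666666666666667, which equals (2^66 + 6) / 10, so the high 64 bits of the product shifted right by 2 give floor(val * b / 2^66). Working through the rounding error suggests the result matches `val / 10` for every input below 2^66 / 6 (about 1.2e19) and can be one too large above that when the last decimal digit is 9. The thread does not say how large the values passed to `DecFifo::push` can get, so that bound is only noted here.

```rust
/// Hypothetical user-space copy of the arm32 branch of `div10` from the patch.
fn div10_mul_inverse(val: u64) -> u64 {
    let val_h = val >> 32;
    let val_l = val & 0xFFFF_FFFF;
    // 0x6666_6666_6666_6667 == (2^66 + 6) / 10, split into 32-bit halves.
    let b_h: u64 = 0x6666_6666;
    let b_l: u64 = 0x6666_6667;

    // Schoolbook 64x64 multiply, keeping only the high 64 bits of the product.
    let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
    let tmp2 = val_l * b_h + (tmp1 & 0xFFFF_FFFF);
    let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);

    // High word >> 2 is floor(val * b / 2^66).
    tmp3 >> 2
}

fn main() {
    // Sample values, chosen for illustration only.
    let samples = [0u64, 9, 10, 99, 1_234_567_890_123, (1 << 56) - 1, u64::MAX];
    for &v in &samples {
        assert_eq!(div10_mul_inverse(v), v / 10);
    }

    // Smallest input (by the arithmetic above) where the approximation
    // rounds up: it is >= 2^66 / 6 and ends in the decimal digit 9.
    let edge: u64 = 12_297_829_382_473_034_419;
    assert_eq!(div10_mul_inverse(edge), edge / 10 + 1);

    println!("all checks passed");
}
```

The commonly used unsigned constant 0xCCCCCCCCCCCCCCCD with a shift of 3 covers the whole u64 range, but its halves no longer fit the overflow-free intermediate sums used above, so it is not a drop-in replacement for `b_h` and `b_l`.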



* Re: [PATCH v2] drm/panic: Add a u64 divide by 10 for arm32
  2025-06-27 12:38 [PATCH v2] drm/panic: Add a u64 divide by 10 for arm32 Jocelyn Falempe
@ 2025-08-01  8:52 ` Jocelyn Falempe
  2025-08-01  9:03 ` Alice Ryhl
  2025-08-01  9:29 ` Thomas Weißschuh
  2 siblings, 0 replies; 4+ messages in thread
From: Jocelyn Falempe @ 2025-08-01  8:52 UTC (permalink / raw)
  To: Andrei Lalaev, Miguel Ojeda, Christian Schrefl, Arnd Bergmann,
	Russell King, Paolo Bonzini, rust-for-linux, Linux ARM,
	Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Simona Vetter, dri-devel

On 27/06/2025 14:38, Jocelyn Falempe wrote:
> On 32-bit ARM, the compiler does not optimize a u64 division by a
> constant into a multiplication by the inverse [1].
> So do the multiplication by the inverse explicitly for this architecture.

Gentle ping.


Best regards,

-- 

Jocelyn




* Re: [PATCH v2] drm/panic: Add a u64 divide by 10 for arm32
  2025-06-27 12:38 [PATCH v2] drm/panic: Add a u64 divide by 10 for arm32 Jocelyn Falempe
  2025-08-01  8:52 ` Jocelyn Falempe
@ 2025-08-01  9:03 ` Alice Ryhl
  2025-08-01  9:29 ` Thomas Weißschuh
  2 siblings, 0 replies; 4+ messages in thread
From: Alice Ryhl @ 2025-08-01  9:03 UTC (permalink / raw)
  To: Jocelyn Falempe
  Cc: Andrei Lalaev, Miguel Ojeda, Christian Schrefl, Arnd Bergmann,
	Russell King, Paolo Bonzini, rust-for-linux, Linux ARM,
	Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Simona Vetter, dri-devel

On Fri, Jun 27, 2025 at 02:38:19PM +0200, Jocelyn Falempe wrote:
> On 32-bit ARM, the compiler does not optimize a u64 division by a
> constant into a multiplication by the inverse [1].
> So do the multiplication by the inverse explicitly for this architecture.
> 
> Link: https://github.com/llvm/llvm-project/issues/37280 [1]
> Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com>
> Closes: https://lore.kernel.org/dri-devel/c0a2771c-f3f5-4d4c-aa82-d673b3c5cb46@gmail.com/
> Fixes: 675008f196ca ("drm/panic: Use a decimal fifo to avoid u64 by u64 divide")
> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>

Reviewed-by: Alice Ryhl <aliceryhl@google.com>


* Re: [PATCH v2] drm/panic: Add a u64 divide by 10 for arm32
  2025-06-27 12:38 [PATCH v2] drm/panic: Add a u64 divide by 10 for arm32 Jocelyn Falempe
  2025-08-01  8:52 ` Jocelyn Falempe
  2025-08-01  9:03 ` Alice Ryhl
@ 2025-08-01  9:29 ` Thomas Weißschuh
  2 siblings, 0 replies; 4+ messages in thread
From: Thomas Weißschuh @ 2025-08-01  9:29 UTC (permalink / raw)
  To: Jocelyn Falempe
  Cc: Andrei Lalaev, Miguel Ojeda, Christian Schrefl, Arnd Bergmann,
	Russell King, Paolo Bonzini, rust-for-linux, Linux ARM,
	Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Simona Vetter, dri-devel

On Fri, Jun 27, 2025 at 02:38:19PM +0200, Jocelyn Falempe wrote:
> On 32-bit ARM, the compiler does not optimize a u64 division by a
> constant into a multiplication by the inverse [1].
> So do the multiplication by the inverse explicitly for this architecture.
> 
> Link: https://github.com/llvm/llvm-project/issues/37280 [1]
> Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com>
> Closes: https://lore.kernel.org/dri-devel/c0a2771c-f3f5-4d4c-aa82-d673b3c5cb46@gmail.com/
> Fixes: 675008f196ca ("drm/panic: Use a decimal fifo to avoid u64 by u64 divide")
> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
> ---
>  drivers/gpu/drm/drm_panic_qr.rs | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_panic_qr.rs b/drivers/gpu/drm/drm_panic_qr.rs
> index dd55b1cb764d..774a17de4f2f 100644
> --- a/drivers/gpu/drm/drm_panic_qr.rs
> +++ b/drivers/gpu/drm/drm_panic_qr.rs
> @@ -381,6 +381,26 @@ struct DecFifo {
>      len: usize,
>  }
>  
> +// On the arm32 architecture, dividing a `u64` by a constant generates a call
> +// to `__aeabi_uldivmod`, which is not present in the kernel.
> +// So use the multiply-by-inverse method for this architecture.

I think the problem here is the u64-by-u64 division; u64 by u32 should work.
Unfortunately, Rust doesn't seem to have a way to perform a mixed-type division.
We already have optimized C/ASM helpers for u64 divisions. For example,
div_u64() does such a u64-by-u32 division.
While it may be slower than the inverse multiplication, it is less code, easier
to understand, and will work on all architectures automatically.
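
To illustrate why a narrow divisor makes this easier, here is a rough user-space sketch. It is not the kernel's `div_u64()` and not a proposal for the patch: `div_u64_small` is a hypothetical helper that divides a `u64` by a divisor of at most 65536 using only `u32` divisions and remainders. It walks the value in 16-bit limbs so that every intermediate dividend fits in 32 bits.

```rust
/// Hypothetical helper (not the kernel's `div_u64()`): returns
/// `(quotient, remainder)` using only `u32` division and remainder.
/// The divisor is limited to 1..=65536 so each partial dividend fits in a u32.
fn div_u64_small(val: u64, divisor: u32) -> (u64, u32) {
    assert!(divisor != 0 && divisor <= 0x1_0000, "divisor must be in 1..=65536");
    let mut quot: u64 = 0;
    let mut rem: u32 = 0;
    // Long division, most significant 16-bit limb first.
    for shift in [48u32, 32, 16, 0] {
        let limb = ((val >> shift) & 0xFFFF) as u32;
        // rem < divisor <= 2^16, so (rem << 16) | limb cannot overflow a u32.
        let cur = (rem << 16) | limb;
        // cur < divisor * 2^16, so the partial quotient fits in 16 bits.
        quot = (quot << 16) | u64::from(cur / divisor);
        rem = cur % divisor;
    }
    (quot, rem)
}

fn main() {
    let v: u64 = 1_234_567_890_123_456_789;
    let (q, r) = div_u64_small(v, 10);
    assert_eq!(q, v / 10);
    assert_eq!(r, (v % 10) as u32);

    let (q, r) = div_u64_small(u64::MAX, 10);
    assert_eq!((q, u64::from(r)), (u64::MAX / 10, u64::MAX % 10));

    println!("all checks passed");
}
```

Because the helper returns the remainder as well, a loop like the one in `push()` could take both the digit and the next chunk from a single call. Whether that is preferable to calling an existing C helper is a separate question that this sketch does not settle.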

> +fn div10(val: u64) -> u64 {
> +    if cfg!(target_arch = "arm") {
> +        let val_h = val >> 32;
> +        let val_l = val & 0xFFFFFFFF;
> +        let b_h: u64 = 0x66666666;
> +        let b_l: u64 = 0x66666667;
> +
> +        let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
> +        let tmp2 = val_l * b_h + (tmp1 & 0xffffffff);
> +        let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);
> +
> +        tmp3 >> 2
> +    } else {
> +        val / 10
> +    }
> +}
> +
>  impl DecFifo {
>      fn push(&mut self, data: u64, len: usize) {
>          let mut chunk = data;
> @@ -389,7 +409,7 @@ fn push(&mut self, data: u64, len: usize) {
>          }
>          for i in 0..len {
>              self.decimals[i] = (chunk % 10) as u8;
> -            chunk /= 10;
> +            chunk = div10(chunk);
>          }
>          self.len += len;
>      }
> 
> base-commit: 3529cb5ab16b4f1f8bbc31dc39a1076a94bd1e38
> -- 
> 2.49.0
> 
