rust-for-linux.vger.kernel.org archive mirror
* [PATCH] drm/panic: Add a u64 divide by 10 for arm32
@ 2025-06-27  9:40 Jocelyn Falempe
  2025-06-27 11:36 ` Alice Ryhl
  2025-06-27 11:44 ` Miguel Ojeda
  0 siblings, 2 replies; 6+ messages in thread
From: Jocelyn Falempe @ 2025-06-27  9:40 UTC
  To: Andrei Lalaev, Miguel Ojeda, Christian Schrefl, Arnd Bergmann,
	Russell King, Paolo Bonzini, rust-for-linux, Linux ARM,
	Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Simona Vetter, dri-devel
  Cc: Jocelyn Falempe

On 32-bit ARM, the compiler does not optimize a u64 division by a
constant into a multiply by the inverse [1].
So do the multiply by inverse explicitly for this architecture.

Link: https://github.com/llvm/llvm-project/issues/37280 [1]
Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com>
Closes: https://lore.kernel.org/dri-devel/c0a2771c-f3f5-4d4c-aa82-d673b3c5cb46@gmail.com/
Fixes: 675008f196ca ("drm/panic: Use a decimal fifo to avoid u64 by u64 divide")
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
---
 drivers/gpu/drm/drm_panic_qr.rs | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_panic_qr.rs b/drivers/gpu/drm/drm_panic_qr.rs
index dd55b1cb764d..82acecd505d3 100644
--- a/drivers/gpu/drm/drm_panic_qr.rs
+++ b/drivers/gpu/drm/drm_panic_qr.rs
@@ -381,6 +381,24 @@ struct DecFifo {
     len: usize,
 }
 
+/// On arm32 architecture, dividing an u64 by a constant will generate a call
+/// to __aeabi_uldivmod which is not present in the kernel.
+/// So use the multiply by inverse method for this architecture.
+#[cfg(target_arch = "arm")]
+fn div10(val: u64) -> u64
+{
+    let val_h = val >> 32;
+    let val_l = val & 0xFFFFFFFF;
+    let b_h: u64 = 0x66666666;
+    let b_l: u64 = 0x66666667;
+
+    let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
+    let tmp2 = val_l * b_h + (tmp1 & 0xffffffff);
+    let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);
+
+    tmp3 >> 2
+}
+
 impl DecFifo {
     fn push(&mut self, data: u64, len: usize) {
         let mut chunk = data;
@@ -389,7 +407,11 @@ fn push(&mut self, data: u64, len: usize) {
         }
         for i in 0..len {
             self.decimals[i] = (chunk % 10) as u8;
-            chunk /= 10;
+            if cfg!(target_arch = "arm") {
+                chunk = div10(chunk);
+            } else {
+                chunk /= 10;
+            }
         }
         self.len += len;
     }

base-commit: 3529cb5ab16b4f1f8bbc31dc39a1076a94bd1e38
-- 
2.49.0
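
For readers following the arithmetic in `div10`: it appears to compute the upper 64 bits of `val * 0x6666666666666667` (that constant is ceil(2^66 / 10)) using only 32x32 -> 64-bit products, then shifts right by 2 more bits, for a total shift of 66. The sketch below restates it without the `cfg` gate so it can run on any host. The names `div10_mul_inverse` and `div10_u128` are hypothetical and not part of the patch; the second is only a cross-check written with a native 128-bit multiply. One caveat, if I have the arithmetic right: since 10 * 0x6666666666666667 = 2^66 + 6, the quotient matches `val / 10` exactly only while 6 * val < 2^66, so the comparison against `val / 10` is only meaningful for inputs below that bound.

```rust
/// Multiply-by-inverse division by 10 using only 32x32 -> 64-bit products.
/// Same arithmetic as the `div10` in the patch, minus the `cfg` gate.
/// (Hypothetical name, for illustration only.)
fn div10_mul_inverse(val: u64) -> u64 {
    let val_h = val >> 32;
    let val_l = val & 0xFFFF_FFFF;
    // 0x6666_6666_6666_6667 == ceil(2^66 / 10), split into 32-bit halves.
    let b_h: u64 = 0x6666_6666;
    let b_l: u64 = 0x6666_6667;

    // Schoolbook multiplication: add up the cross terms and carry their
    // upper halves into the high 64 bits of the 128-bit product.
    let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
    let tmp2 = val_l * b_h + (tmp1 & 0xFFFF_FFFF);
    let high = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);

    // `high` is (val * b) >> 64, so two more bits give (val * b) >> 66.
    high >> 2
}

/// Hypothetical cross-check helper (not in the patch): the same
/// computation written with a native 128-bit multiply.
fn div10_u128(val: u64) -> u64 {
    ((val as u128 * 0x6666_6666_6666_6667_u128) >> 66) as u64
}
```

Comparing both helpers against `val / 10` on a 64-bit host, for values below 2^63, is a cheap way to check the carry handling before relying on it on arm32.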



* Re: [PATCH] drm/panic: Add a u64 divide by 10 for arm32
  2025-06-27  9:40 [PATCH] drm/panic: Add a u64 divide by 10 for arm32 Jocelyn Falempe
@ 2025-06-27 11:36 ` Alice Ryhl
  2025-06-27 11:41   ` Miguel Ojeda
  2025-06-27 11:47   ` Jocelyn Falempe
  2025-06-27 11:44 ` Miguel Ojeda
  1 sibling, 2 replies; 6+ messages in thread
From: Alice Ryhl @ 2025-06-27 11:36 UTC
  To: Jocelyn Falempe
  Cc: Andrei Lalaev, Miguel Ojeda, Christian Schrefl, Arnd Bergmann,
	Russell King, Paolo Bonzini, rust-for-linux, Linux ARM,
	Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Simona Vetter, dri-devel

On Fri, Jun 27, 2025 at 11:41 AM Jocelyn Falempe <jfalempe@redhat.com> wrote:
>
> On 32-bit ARM, the compiler does not optimize a u64 division by a
> constant into a multiply by the inverse [1].
> So do the multiply by inverse explicitly for this architecture.
>
> Link: https://github.com/llvm/llvm-project/issues/37280 [1]
> Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com>
> Closes: https://lore.kernel.org/dri-devel/c0a2771c-f3f5-4d4c-aa82-d673b3c5cb46@gmail.com/
> Fixes: 675008f196ca ("drm/panic: Use a decimal fifo to avoid u64 by u64 divide")
> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>

Not to block this change, but I think this really ought to be fixed in
the compiler. We should not have to do this kind of thing to divide by
10.

>  drivers/gpu/drm/drm_panic_qr.rs | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_panic_qr.rs b/drivers/gpu/drm/drm_panic_qr.rs
> index dd55b1cb764d..82acecd505d3 100644
> --- a/drivers/gpu/drm/drm_panic_qr.rs
> +++ b/drivers/gpu/drm/drm_panic_qr.rs
> @@ -381,6 +381,24 @@ struct DecFifo {
>      len: usize,
>  }
>
> +/// On arm32 architecture, dividing an u64 by a constant will generate a call
> +/// to __aeabi_uldivmod which is not present in the kernel.
> +/// So use the multiply by inverse method for this architecture.
> +#[cfg(target_arch = "arm")]
> +fn div10(val: u64) -> u64
> +{

Please run rustfmt on your patch.

> +    let val_h = val >> 32;
> +    let val_l = val & 0xFFFFFFFF;
> +    let b_h: u64 = 0x66666666;
> +    let b_l: u64 = 0x66666667;
> +
> +    let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
> +    let tmp2 = val_l * b_h + (tmp1 & 0xffffffff);
> +    let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);
> +
> +    tmp3 >> 2
> +}
> +
>  impl DecFifo {
>      fn push(&mut self, data: u64, len: usize) {
>          let mut chunk = data;
> @@ -389,7 +407,11 @@ fn push(&mut self, data: u64, len: usize) {
>          }
>          for i in 0..len {
>              self.decimals[i] = (chunk % 10) as u8;
> -            chunk /= 10;
> +            if cfg!(target_arch = "arm") {
> +                chunk = div10(chunk);
> +            } else {
> +                chunk /= 10;
> +            }

I would get rid of this conditional and declare another div10 function
that just does input/10 on other arches.

Alice
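
A minimal sketch of the layout suggested here, assuming two `cfg`-gated definitions of `div10` with the same signature so that the caller needs no conditional. The helper `decimal_digits` is hypothetical; it only stands in for the loop in `DecFifo::push` and is not code from the patch.

```rust
// arm32 variant: multiply by the inverse, as in the patch.
#[cfg(target_arch = "arm")]
fn div10(val: u64) -> u64 {
    let val_h = val >> 32;
    let val_l = val & 0xFFFF_FFFF;
    let b_h: u64 = 0x6666_6666;
    let b_l: u64 = 0x6666_6667;

    let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
    let tmp2 = val_l * b_h + (tmp1 & 0xFFFF_FFFF);
    let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);

    tmp3 >> 2
}

// Every other architecture: plain division.
#[cfg(not(target_arch = "arm"))]
fn div10(val: u64) -> u64 {
    val / 10
}

/// Hypothetical stand-in for the loop in `DecFifo::push`: writes the
/// decimal digits of `chunk` into `out`, least significant digit first.
fn decimal_digits(mut chunk: u64, out: &mut [u8]) {
    for digit in out.iter_mut() {
        *digit = (chunk % 10) as u8;
        // No `cfg` at the call site; the right `div10` is picked at build time.
        chunk = div10(chunk);
    }
}
```

With this split, only one of the two bodies is compiled for a given target, so the arm32 body is never type-checked on other architectures.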


* Re: [PATCH] drm/panic: Add a u64 divide by 10 for arm32
  2025-06-27 11:36 ` Alice Ryhl
@ 2025-06-27 11:41   ` Miguel Ojeda
  2025-06-27 11:47   ` Jocelyn Falempe
  1 sibling, 0 replies; 6+ messages in thread
From: Miguel Ojeda @ 2025-06-27 11:41 UTC
  To: Alice Ryhl
  Cc: Jocelyn Falempe, Andrei Lalaev, Miguel Ojeda, Christian Schrefl,
	Arnd Bergmann, Russell King, Paolo Bonzini, rust-for-linux,
	Linux ARM, Thomas Zimmermann, Javier Martinez Canillas,
	Maarten Lankhorst, Maxime Ripard, David Airlie, Simona Vetter,
	dri-devel

On Fri, Jun 27, 2025 at 1:37 PM Alice Ryhl <aliceryhl@google.com> wrote:
>
> I would get rid of this conditional and declare another div10 function
> that just does input/10 on other arches.

Yeah, please keep `cfg`s as local as possible, i.e. inside the
implementation of the function where possible, so that we share even
the signature etc., e.g.

    https://lore.kernel.org/rust-for-linux/20250625051518.15255-6-boqun.feng@gmail.com/

Cheers,
Miguel
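
One possible reading of "keep `cfg`s as local as possible" is a single `div10` whose body branches on `cfg!`, so the signature and any comments are shared across architectures. This is a sketch under that assumption, not necessarily the form a later revision takes.

```rust
// On arm32, dividing a `u64` by a constant generates a call to
// `__aeabi_uldivmod`, which is not present in the kernel, so that
// architecture uses the multiply-by-inverse method instead.
fn div10(val: u64) -> u64 {
    if cfg!(target_arch = "arm") {
        let val_h = val >> 32;
        let val_l = val & 0xFFFF_FFFF;
        let b_h: u64 = 0x6666_6666;
        let b_l: u64 = 0x6666_6667;

        let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
        let tmp2 = val_l * b_h + (tmp1 & 0xFFFF_FFFF);
        let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);

        tmp3 >> 2
    } else {
        val / 10
    }
}
```

Because `cfg!` expands to a constant `true` or `false`, both branches are type-checked on every architecture, which differs from `#[cfg]` on two separate functions where only one body is compiled.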


* Re: [PATCH] drm/panic: Add a u64 divide by 10 for arm32
  2025-06-27  9:40 [PATCH] drm/panic: Add a u64 divide by 10 for arm32 Jocelyn Falempe
  2025-06-27 11:36 ` Alice Ryhl
@ 2025-06-27 11:44 ` Miguel Ojeda
  2025-06-27 11:52   ` Jocelyn Falempe
  1 sibling, 1 reply; 6+ messages in thread
From: Miguel Ojeda @ 2025-06-27 11:44 UTC
  To: Jocelyn Falempe
  Cc: Andrei Lalaev, Miguel Ojeda, Christian Schrefl, Arnd Bergmann,
	Russell King, Paolo Bonzini, rust-for-linux, Linux ARM,
	Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Simona Vetter, dri-devel

On Fri, Jun 27, 2025 at 11:41 AM Jocelyn Falempe <jfalempe@redhat.com> wrote:
>
> +/// On arm32 architecture, dividing an u64 by a constant will generate a call
> +/// to __aeabi_uldivmod which is not present in the kernel.
> +/// So use the multiply by inverse method for this architecture.

This sounds more like a normal comment instead of function docs, no?

By the way, formatting:

    `u64`
    `__aeabi_uldivmod`

Thanks for fixing this! It is nice seeing 32-bit arm taking shape.

Cheers,
Miguel


* Re: [PATCH] drm/panic: Add a u64 divide by 10 for arm32
  2025-06-27 11:36 ` Alice Ryhl
  2025-06-27 11:41   ` Miguel Ojeda
@ 2025-06-27 11:47   ` Jocelyn Falempe
  1 sibling, 0 replies; 6+ messages in thread
From: Jocelyn Falempe @ 2025-06-27 11:47 UTC
  To: Alice Ryhl
  Cc: Andrei Lalaev, Miguel Ojeda, Christian Schrefl, Arnd Bergmann,
	Russell King, Paolo Bonzini, rust-for-linux, Linux ARM,
	Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Simona Vetter, dri-devel

On 27/06/2025 13:36, Alice Ryhl wrote:
> On Fri, Jun 27, 2025 at 11:41 AM Jocelyn Falempe <jfalempe@redhat.com> wrote:
>>
>> On 32-bit ARM, the compiler does not optimize a u64 division by a
>> constant into a multiply by the inverse [1].
>> So do the multiply by inverse explicitly for this architecture.
>>
>> Link: https://github.com/llvm/llvm-project/issues/37280 [1]
>> Reported-by: Andrei Lalaev <andrey.lalaev@gmail.com>
>> Closes: https://lore.kernel.org/dri-devel/c0a2771c-f3f5-4d4c-aa82-d673b3c5cb46@gmail.com/
>> Fixes: 675008f196ca ("drm/panic: Use a decimal fifo to avoid u64 by u64 divide")
>> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
> 
> Not to block this change, but I think this really ought to be fixed in
> the compiler. We should not have to do this kind of thing to divide by
> 10.

I agree, I didn't expect that to be a problem. But I'm not a compiler
expert, and it will probably take time to update the compiler, so we
have to do this at least temporarily.
> 
>>   drivers/gpu/drm/drm_panic_qr.rs | 24 +++++++++++++++++++++++-
>>   1 file changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_panic_qr.rs b/drivers/gpu/drm/drm_panic_qr.rs
>> index dd55b1cb764d..82acecd505d3 100644
>> --- a/drivers/gpu/drm/drm_panic_qr.rs
>> +++ b/drivers/gpu/drm/drm_panic_qr.rs
>> @@ -381,6 +381,24 @@ struct DecFifo {
>>       len: usize,
>>   }
>>
>> +/// On arm32 architecture, dividing an u64 by a constant will generate a call
>> +/// to __aeabi_uldivmod which is not present in the kernel.
>> +/// So use the multiply by inverse method for this architecture.
>> +#[cfg(target_arch = "arm")]
>> +fn div10(val: u64) -> u64
>> +{
> 
> Please run rustfmt on your patch.

Sorry, I will fix that.
> 
>> +    let val_h = val >> 32;
>> +    let val_l = val & 0xFFFFFFFF;
>> +    let b_h: u64 = 0x66666666;
>> +    let b_l: u64 = 0x66666667;
>> +
>> +    let tmp1 = val_h * b_l + ((val_l * b_l) >> 32);
>> +    let tmp2 = val_l * b_h + (tmp1 & 0xffffffff);
>> +    let tmp3 = val_h * b_h + (tmp1 >> 32) + (tmp2 >> 32);
>> +
>> +    tmp3 >> 2
>> +}
>> +
>>   impl DecFifo {
>>       fn push(&mut self, data: u64, len: usize) {
>>           let mut chunk = data;
>> @@ -389,7 +407,11 @@ fn push(&mut self, data: u64, len: usize) {
>>           }
>>           for i in 0..len {
>>               self.decimals[i] = (chunk % 10) as u8;
>> -            chunk /= 10;
>> +            if cfg!(target_arch = "arm") {
>> +                chunk = div10(chunk);
>> +            } else {
>> +                chunk /= 10;
>> +            }
> 
> I would get rid of this conditional and declare another div10 function
> that just does input/10 on other arches.

OK, I will send a v2 shortly with that changed.
> 
> Alice
> 



* Re: [PATCH] drm/panic: Add a u64 divide by 10 for arm32
  2025-06-27 11:44 ` Miguel Ojeda
@ 2025-06-27 11:52   ` Jocelyn Falempe
  0 siblings, 0 replies; 6+ messages in thread
From: Jocelyn Falempe @ 2025-06-27 11:52 UTC
  To: Miguel Ojeda
  Cc: Andrei Lalaev, Miguel Ojeda, Christian Schrefl, Arnd Bergmann,
	Russell King, Paolo Bonzini, rust-for-linux, Linux ARM,
	Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
	Maxime Ripard, David Airlie, Simona Vetter, dri-devel

On 27/06/2025 13:44, Miguel Ojeda wrote:
> On Fri, Jun 27, 2025 at 11:41 AM Jocelyn Falempe <jfalempe@redhat.com> wrote:
>>
>> +/// On arm32 architecture, dividing an u64 by a constant will generate a call
>> +/// to __aeabi_uldivmod which is not present in the kernel.
>> +/// So use the multiply by inverse method for this architecture.
> 
> This sounds more like a normal comment instead of function docs, no?

Yes, I still mix up // and ///, so I will replace it with //.
> 
> By the way, formatting:
> 
>      `u64`
>      `__aeabi_uldivmod`

OK, I will do that in v2.

> 
> Thanks for fixing this! It is nice seeing 32-bit arm taking shape.

Thank you for the quick feedback.

> 
> Cheers,
> Miguel
> 

-- 

Jocelyn


