Re: [PATCH v6 9/9] drm/amd/display: Mark dc_fixpt_from

Building the Linux kernel with Clang and LLVM
 help / color / mirror / Atom feed

* Re: [PATCH v6 9/9] drm/amd/display: Mark dc_fixpt_from_fraction() noinline
       [not found]         ` <4bace457-cc26-13a3-bc90-ad6ad12bc2ed@loongson.cn>
@ 2024-12-20 10:31           ` Peter Zijlstra
  2024-12-20 22:34             ` Nathan Chancellor
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2024-12-20 10:31 UTC (permalink / raw)
  To: Tiezhu Yang
  Cc: Alex Deucher, Josh Poimboeuf, Huacai Chen, loongarch, amd-gfx,
	linux-kernel, Nick Desaulniers, nathan, llvm

On Fri, Dec 20, 2024 at 01:02:18PM +0800, Tiezhu Yang wrote:

> 2. For x86
> 
> I tested with LLVM 19.1.6 and the latest mainline LLVM, the test result
> is same with LoongArch.

Debian's clang-19 is 19.1.5.

> (1) objdump info with LLVM release version 19.1.6:

Please always use -r, that's ever so much more readable.

> $ clang --version | head -1
> clang version 19.1.6
> 
> There is an jump instruction "jmp" at the end of dc_fixpt_recip(), it
> maybe jumps to a position and then steps to the return instruction, so
> there is no "falls through" objtool warning.
> 
> 0000000000000290 <dc_fixpt_recip>:
>      290:       f3 0f 1e fa             endbr64
>      294:       e8 00 00 00 00          call   299 <dc_fixpt_recip+0x9>
>      299:       48 85 ff                test   %rdi,%rdi
>      29c:       0f 84 a8 00 00 00       je     34a <dc_fixpt_recip+0xba>
>      2a2:       48 89 f9                mov    %rdi,%rcx
>      2a5:       48 f7 d9                neg    %rcx
>      2a8:       48 0f 48 cf             cmovs  %rdi,%rcx
>      2ac:       53                      push   %rbx
>      2ad:       48 b8 00 00 00 00 01    movabs $0x100000000,%rax
>      2b4:       00 00 00
>      2b7:       31 d2                   xor    %edx,%edx
>      2b9:       48 f7 f1                div    %rcx
>      2bc:       be e0 ff ff ff          mov    $0xffffffe0,%esi
>      2c1:       eb 1b                   jmp    2de <dc_fixpt_recip+0x4e>
>      2c3:       45 88 c8                mov    %r9b,%r8b
>      2c6:       4d 01 c0                add    %r8,%r8
>      2c9:       4d 8d 04 80             lea    (%r8,%rax,4),%r8
>      2cd:       48 29 da                sub    %rbx,%rdx
>      2d0:       45 88 da                mov    %r11b,%r10b
>      2d3:       4c 89 d0                mov    %r10,%rax
>      2d6:       4c 09 c0                or     %r8,%rax
>      2d9:       83 c6 02                add    $0x2,%esi
>      2dc:       74 31                   je     30f <dc_fixpt_recip+0x7f>
>      2de:       48 01 d2                add    %rdx,%rdx
>      2e1:       45 31 c0                xor    %r8d,%r8d
>      2e4:       41 ba 00 00 00 00       mov    $0x0,%r10d
>      2ea:       48 39 ca                cmp    %rcx,%rdx
>      2ed:       41 0f 93 c1             setae  %r9b
>      2f1:       72 03                   jb     2f6 <dc_fixpt_recip+0x66>
>      2f3:       49 89 ca                mov    %rcx,%r10
>      2f6:       4c 29 d2                sub    %r10,%rdx
>      2f9:       48 01 d2                add    %rdx,%rdx
>      2fc:       45 31 d2                xor    %r10d,%r10d
>      2ff:       48 89 cb                mov    %rcx,%rbx
>      302:       48 39 ca                cmp    %rcx,%rdx
>      305:       41 0f 93 c3             setae  %r11b
>      309:       73 b8                   jae    2c3 <dc_fixpt_recip+0x33>
>      30b:       31 db                   xor    %ebx,%ebx
>      30d:       eb b4                   jmp    2c3 <dc_fixpt_recip+0x33>
>      30f:       48 01 d2                add    %rdx,%rdx
>      312:       48 be fe ff ff ff ff    movabs $0x7ffffffffffffffe,%rsi
>      319:       ff ff 7f
>      31c:       4c 8d 46 01             lea    0x1(%rsi),%r8
>      320:       48 39 ca                cmp    %rcx,%rdx
>      323:       4c 0f 43 c6             cmovae %rsi,%r8
>      327:       4c 39 c0                cmp    %r8,%rax
>      32a:       77 29                   ja     355 <dc_fixpt_recip+0xc5>
>      32c:       48 39 ca                cmp    %rcx,%rdx
>      32f:       48 83 d8 ff             sbb    $0xffffffffffffffff,%rax
>      333:       48 89 c1                mov    %rax,%rcx
>      336:       48 f7 d9                neg    %rcx
>      339:       48 85 ff                test   %rdi,%rdi
>      33c:       48 0f 49 c8             cmovns %rax,%rcx
>      340:       48 89 c8                mov    %rcx,%rax
>      343:       5b                      pop    %rbx
>      344:       2e e9 00 00 00 00       cs jmp 34a <dc_fixpt_recip+0xba>
>      34a:       0f 0b                   ud2
>      34c:       0f 0b                   ud2
>      34e:       31 c9                   xor    %ecx,%ecx
>      350:       e9 57 ff ff ff          jmp    2ac <dc_fixpt_recip+0x1c>
>      355:       0f 0b                   ud2
>      357:       eb d3                   jmp    32c <dc_fixpt_recip+0x9c>
>      359:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

I had to puzzle a bit to get that double ud2 -- not all configs do that.

Also, curse the DRM Makefiles, you can't do:

  make drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.s

:-(

They are the two ASSERT()s on line 217 and 54 respectively. They end up
asserting the same value twice, so that makes sense.

> $ clang --version | head -1
> clang version 20.0.0git (https://github.com/llvm/llvm-project.git
> 8daf4f16fa08b5d876e98108721dd1743a360326)

So I didn't have a recent build at hand.. so I've not validated the
below.

> There is "ud2" instruction at the end of dc_fixpt_recip(), its offset
> is "350", this "ud2" instruction is not dead end due to the offset "352"
> is in the relocation section '.rela.discard.reachable', that is to say,
> dc_fixpt_recip() doesn't end with a return instruction or an
> unconditional jump, so objtool determined that the function can fall
> through into the next function, thus there is "falls through" objtool
> warning.
> 
> 0000000000000290 <dc_fixpt_recip>:
>      290:       f3 0f 1e fa             endbr64
>      294:       e8 00 00 00 00          call   299 <dc_fixpt_recip+0x9>
>      299:       48 85 ff                test   %rdi,%rdi
>      29c:       0f 84 ac 00 00 00       je     34e <dc_fixpt_recip+0xbe>
>      2a2:       53                      push   %rbx
>      2a3:       48 89 f9                mov    %rdi,%rcx
>      2a6:       48 f7 d9                neg    %rcx
>      2a9:       48 0f 48 cf             cmovs  %rdi,%rcx
>      2ad:       48 b8 00 00 00 00 01    movabs $0x100000000,%rax
>      2b4:       00 00 00
>      2b7:       31 d2                   xor    %edx,%edx
>      2b9:       48 f7 f1                div    %rcx
>      2bc:       be e0 ff ff ff          mov    $0xffffffe0,%esi
>      2c1:       eb 1b                   jmp    2de <dc_fixpt_recip+0x4e>
>      2c3:       45 88 c8                mov    %r9b,%r8b
>      2c6:       4d 01 c0                add    %r8,%r8
>      2c9:       4d 8d 04 80             lea    (%r8,%rax,4),%r8
>      2cd:       48 29 da                sub    %rbx,%rdx
>      2d0:       45 88 da                mov    %r11b,%r10b
>      2d3:       4c 89 d0                mov    %r10,%rax
>      2d6:       4c 09 c0                or     %r8,%rax
>      2d9:       83 c6 02                add    $0x2,%esi
>      2dc:       74 31                   je     30f <dc_fixpt_recip+0x7f>
>      2de:       48 01 d2                add    %rdx,%rdx
>      2e1:       45 31 c0                xor    %r8d,%r8d
>      2e4:       41 ba 00 00 00 00       mov    $0x0,%r10d
>      2ea:       48 39 ca                cmp    %rcx,%rdx
>      2ed:       41 0f 93 c1             setae  %r9b
>      2f1:       72 03                   jb     2f6 <dc_fixpt_recip+0x66>
>      2f3:       49 89 ca                mov    %rcx,%r10
>      2f6:       4c 29 d2                sub    %r10,%rdx
>      2f9:       48 01 d2                add    %rdx,%rdx
>      2fc:       45 31 d2                xor    %r10d,%r10d
>      2ff:       48 89 cb                mov    %rcx,%rbx
>      302:       48 39 ca                cmp    %rcx,%rdx
>      305:       41 0f 93 c3             setae  %r11b
>      309:       73 b8                   jae    2c3 <dc_fixpt_recip+0x33>
>      30b:       31 db                   xor    %ebx,%ebx
>      30d:       eb b4                   jmp    2c3 <dc_fixpt_recip+0x33>
>      30f:       48 01 d2                add    %rdx,%rdx
>      312:       48 be fe ff ff ff ff    movabs $0x7ffffffffffffffe,%rsi
>      319:       ff ff 7f
>      31c:       4c 8d 46 01             lea    0x1(%rsi),%r8
>      320:       48 39 ca                cmp    %rcx,%rdx
>      323:       4c 0f 43 c6             cmovae %rsi,%r8
>      327:       4c 39 c0                cmp    %r8,%rax
>      32a:       77 1e                   ja     34a <dc_fixpt_recip+0xba>
>      32c:       48 39 ca                cmp    %rcx,%rdx
>      32f:       48 83 d8 ff             sbb    $0xffffffffffffffff,%rax
>      333:       48 89 c1                mov    %rax,%rcx
>      336:       48 f7 d9                neg    %rcx
>      339:       48 85 ff                test   %rdi,%rdi
>      33c:       48 0f 49 c8             cmovns %rax,%rcx
>      340:       48 89 c8                mov    %rcx,%rax
>      343:       5b                      pop    %rbx
>      344:       2e e9 00 00 00 00       cs jmp 34a <dc_fixpt_recip+0xba>
>      34a:       0f 0b                   ud2
>      34c:       eb de                   jmp    32c <dc_fixpt_recip+0x9c>
>      34e:       0f 0b                   ud2
>      350:       0f 0b                   ud2
>      352:       66 66 66 66 66 2e 0f    data16 data16 data16 data16 cs nopw
> 0x0(%rax,%rax,1)
>      359:       1f 84 00 00 00 00 00

If you put them size-by-side, you'll see it's more or less the same
code-gen (trivial differences), but now it just stops code-gen, where
previously it would continue.

So this really is a compiler problem, this needs no annotation, it's
straight up broken.

Now, the thing is, these ASSERT()s are checking for divide-by-zero, I
suspect clang figured that out and invokes UB on us and just stops
code-gen.

Nathan, Nick, don't we have a compiler flag that forces __builtin_trap()
whenever clang pulls something like this? I think UBSAN does this, but
we really shouldn't pull in the whole of that for sanity.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 9/9] drm/amd/display: Mark dc_fixpt_from_fraction() noinline
  2024-12-20 10:31           ` [PATCH v6 9/9] drm/amd/display: Mark dc_fixpt_from_fraction() noinline Peter Zijlstra
@ 2024-12-20 22:34             ` Nathan Chancellor
  2024-12-21  7:40               ` Xi Ruoyao
  0 siblings, 1 reply; 6+ messages in thread
From: Nathan Chancellor @ 2024-12-20 22:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tiezhu Yang, Alex Deucher, Josh Poimboeuf, Huacai Chen, loongarch,
	amd-gfx, linux-kernel, Nick Desaulniers, llvm

On Fri, Dec 20, 2024 at 11:31:00AM +0100, Peter Zijlstra wrote:
> Also, curse the DRM Makefiles, you can't do:
> 
>   make drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.s

Small tip: You can get the path of the target by building
drivers/gpu/drm/amd/amdgpu/ and finding it in the output. In this case,
it'd be

  $ make drivers/gpu/drm/amd/amdgpu/../display/dc/basics/fixpt31_32.s

Not excusing that it does not work as it should but sometimes you have
to work with what you can *shrug*

> > $ clang --version | head -1
> > clang version 20.0.0git (https://github.com/llvm/llvm-project.git
> > 8daf4f16fa08b5d876e98108721dd1743a360326)
> 
> So I didn't have a recent build at hand.. so I've not validated the
> below.
...
> If you put them size-by-side, you'll see it's more or less the same
> code-gen (trivial differences), but now it just stops code-gen, where
> previously it would continue.
> 
> So this really is a compiler problem, this needs no annotation, it's
> straight up broken.
> 
> Now, the thing is, these ASSERT()s are checking for divide-by-zero, I
> suspect clang figured that out and invokes UB on us and just stops
> code-gen.

Yeah, I think your analysis is spot on, as this was introduced by a
change in clang from a few months ago according to my bisect:

https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4eae7ff1448

Since the ASSERT does not do anything to prevent the divide by zero (it
just flags it with WARN_ON) and the rest of the code doesn't either, I
assume that the codegen stops as soon as it encounters the unreachable
that change created from the path where divide by zero would occur via

  dc_fixpt_recip() ->
    dc_fixpt_from_fraction() ->
      complete_integer_division_u64() ->
        div64_u64_rem()

Shouldn't callers of division functions harden them against dividing by
zero?

> Nathan, Nick, don't we have a compiler flag that forces __builtin_trap()
> whenever clang pulls something like this? I think UBSAN does this, but
> we really shouldn't pull in the whole of that for sanity.

Right, I think that LLVM has a hidden flag for this:

  -mllvm -trap-unreachable

That makes this particular warning disappear.

It isn't the greatest because '-mllvm' flags need to be passed along to
the linker for LTO but that's easy enough to deal with. I know we have
talked about enabling that flag in the past but I cannot remember why we
decided against it (maybe code size concerns and other optimization
restrictions)? It looks like GCC has a similar flag,
-funreachable-traps.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 9/9] drm/amd/display: Mark dc_fixpt_from_fraction() noinline
  2024-12-20 22:34             ` Nathan Chancellor
@ 2024-12-21  7:40               ` Xi Ruoyao
  2024-12-22  4:27                 ` Tiezhu Yang
  0 siblings, 1 reply; 6+ messages in thread
From: Xi Ruoyao @ 2024-12-21  7:40 UTC (permalink / raw)
  To: Nathan Chancellor, Peter Zijlstra
  Cc: Tiezhu Yang, Alex Deucher, Josh Poimboeuf, Huacai Chen, loongarch,
	amd-gfx, linux-kernel, Nick Desaulniers, llvm

On Fri, 2024-12-20 at 15:34 -0700, Nathan Chancellor wrote:
> > Now, the thing is, these ASSERT()s are checking for divide-by-zero, I
> > suspect clang figured that out and invokes UB on us and just stops
> > code-gen.
> 
> Yeah, I think your analysis is spot on, as this was introduced by a
> change in clang from a few months ago according to my bisect:
> 
> https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4eae7ff1448
> 
> Since the ASSERT does not do anything to prevent the divide by zero (it
> just flags it with WARN_ON) and the rest of the code doesn't either, I
> assume that the codegen stops as soon as it encounters the unreachable
> that change created from the path where divide by zero would occur via
> 
>   dc_fixpt_recip() ->
>     dc_fixpt_from_fraction() ->
>       complete_integer_division_u64() ->
>         div64_u64_rem()
> 
> Shouldn't callers of division functions harden them against dividing by
> zero?

Yes I think it'd be the correct solution.

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 9/9] drm/amd/display: Mark dc_fixpt_from_fraction() noinline
  2024-12-21  7:40               ` Xi Ruoyao
@ 2024-12-22  4:27                 ` Tiezhu Yang
  2024-12-23 21:46                   ` Nathan Chancellor
  0 siblings, 1 reply; 6+ messages in thread
From: Tiezhu Yang @ 2024-12-22  4:27 UTC (permalink / raw)
  To: Xi Ruoyao, Nathan Chancellor, Peter Zijlstra
  Cc: Alex Deucher, Josh Poimboeuf, Huacai Chen, loongarch, amd-gfx,
	linux-kernel, Nick Desaulniers, llvm

On 12/21/2024 03:40 PM, Xi Ruoyao wrote:
> On Fri, 2024-12-20 at 15:34 -0700, Nathan Chancellor wrote:
>>> Now, the thing is, these ASSERT()s are checking for divide-by-zero, I
>>> suspect clang figured that out and invokes UB on us and just stops
>>> code-gen.
>>
>> Yeah, I think your analysis is spot on, as this was introduced by a
>> change in clang from a few months ago according to my bisect:
>>
>> https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4eae7ff1448
>>
>> Since the ASSERT does not do anything to prevent the divide by zero (it
>> just flags it with WARN_ON) and the rest of the code doesn't either, I
>> assume that the codegen stops as soon as it encounters the unreachable
>> that change created from the path where divide by zero would occur via
>>
>>   dc_fixpt_recip() ->
>>     dc_fixpt_from_fraction() ->
>>       complete_integer_division_u64() ->
>>         div64_u64_rem()
>>
>> Shouldn't callers of division functions harden them against dividing by
>> zero?
>
> Yes I think it'd be the correct solution.

Thank you all. Do you mean like this?

--- >8 ---

diff --git a/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c 
b/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
index 88d3f9d7dd55..848d8e67304a 100644
--- a/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
+++ b/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
@@ -79,11 +79,13 @@ struct fixed31_32 dc_fixpt_from_fraction(long long 
numerator, long long denomina
         unsigned long long arg2_value = arg2_negative ? -denominator : 
denominator;

         unsigned long long remainder;
+       unsigned long long res_value;

         /* determine integer part */

-       unsigned long long res_value = complete_integer_division_u64(
-               arg1_value, arg2_value, &remainder);
+       ASSERT(arg2_value);
+
+       res_value = complete_integer_division_u64(arg1_value, 
arg2_value, &remainder);

         ASSERT(res_value <= LONG_MAX);

@@ -214,8 +216,6 @@ struct fixed31_32 dc_fixpt_recip(struct fixed31_32 arg)
          * Good idea to use Newton's method
          */

-       ASSERT(arg.value);
-
         return dc_fixpt_from_fraction(
                 dc_fixpt_one.value,
                 arg.value);

With the above changes, there is no "falls through" objtool warning
compiled with both clang 19 and the latest mainline clang 20.

If you are OK with it, I will send a separate formal patch to handle
this issue after doing some more testing.

Thanks,
Tiezhu


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 9/9] drm/amd/display: Mark dc_fixpt_from_fraction() noinline
  2024-12-22  4:27                 ` Tiezhu Yang
@ 2024-12-23 21:46                   ` Nathan Chancellor
  2024-12-25  9:43                     ` Tiezhu Yang
  0 siblings, 1 reply; 6+ messages in thread
From: Nathan Chancellor @ 2024-12-23 21:46 UTC (permalink / raw)
  To: Tiezhu Yang
  Cc: Xi Ruoyao, Peter Zijlstra, Alex Deucher, Josh Poimboeuf,
	Huacai Chen, loongarch, amd-gfx, linux-kernel, Nick Desaulniers,
	llvm

On Sun, Dec 22, 2024 at 12:27:47PM +0800, Tiezhu Yang wrote:
> On 12/21/2024 03:40 PM, Xi Ruoyao wrote:
> > On Fri, 2024-12-20 at 15:34 -0700, Nathan Chancellor wrote:
> > > > Now, the thing is, these ASSERT()s are checking for divide-by-zero, I
> > > > suspect clang figured that out and invokes UB on us and just stops
> > > > code-gen.
> > > 
> > > Yeah, I think your analysis is spot on, as this was introduced by a
> > > change in clang from a few months ago according to my bisect:
> > > 
> > > https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4eae7ff1448
> > > 
> > > Since the ASSERT does not do anything to prevent the divide by zero (it
> > > just flags it with WARN_ON) and the rest of the code doesn't either, I
> > > assume that the codegen stops as soon as it encounters the unreachable
> > > that change created from the path where divide by zero would occur via
> > > 
> > >   dc_fixpt_recip() ->
> > >     dc_fixpt_from_fraction() ->
> > >       complete_integer_division_u64() ->
> > >         div64_u64_rem()
> > > 
> > > Shouldn't callers of division functions harden them against dividing by
> > > zero?
> > 
> > Yes I think it'd be the correct solution.
> 
> Thank you all. Do you mean like this?
> 
> --- >8 ---
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
> b/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
> index 88d3f9d7dd55..848d8e67304a 100644
> --- a/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
> +++ b/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
> @@ -79,11 +79,13 @@ struct fixed31_32 dc_fixpt_from_fraction(long long
> numerator, long long denomina
>         unsigned long long arg2_value = arg2_negative ? -denominator :
> denominator;
> 
>         unsigned long long remainder;
> +       unsigned long long res_value;
> 
>         /* determine integer part */
> 
> -       unsigned long long res_value = complete_integer_division_u64(
> -               arg1_value, arg2_value, &remainder);
> +       ASSERT(arg2_value);
> +
> +       res_value = complete_integer_division_u64(arg1_value, arg2_value,
> &remainder);
> 
>         ASSERT(res_value <= LONG_MAX);
> 
> @@ -214,8 +216,6 @@ struct fixed31_32 dc_fixpt_recip(struct fixed31_32 arg)
>          * Good idea to use Newton's method
>          */
> 
> -       ASSERT(arg.value);
> -
>         return dc_fixpt_from_fraction(
>                 dc_fixpt_one.value,
>                 arg.value);
> 
> With the above changes, there is no "falls through" objtool warning
> compiled with both clang 19 and the latest mainline clang 20.

I am somewhat surprised that changes anything because the ASSERT is not
stopping control flow so I would expect the same problem as before. I
guess it does not happen perhaps due to inlining differences? I looked
at this code briefly when I sent my initial message and I was not sure
where such a check should exist. It does not look like these functions
really do any sort of error handling.

> If you are OK with it, I will send a separate formal patch to handle
> this issue after doing some more testing.

It may still be worth doing this to get some initial thoughts from the
AMD DRM folks.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v6 9/9] drm/amd/display: Mark dc_fixpt_from_fraction() noinline
  2024-12-23 21:46                   ` Nathan Chancellor
@ 2024-12-25  9:43                     ` Tiezhu Yang
  0 siblings, 0 replies; 6+ messages in thread
From: Tiezhu Yang @ 2024-12-25  9:43 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Xi Ruoyao, Peter Zijlstra, Alex Deucher, Josh Poimboeuf,
	Huacai Chen, loongarch, amd-gfx, linux-kernel, Nick Desaulniers,
	llvm

On 12/24/2024 05:46 AM, Nathan Chancellor wrote:
> On Sun, Dec 22, 2024 at 12:27:47PM +0800, Tiezhu Yang wrote:

...

>> With the above changes, there is no "falls through" objtool warning
>> compiled with both clang 19 and the latest mainline clang 20.
>
> I am somewhat surprised that changes anything because the ASSERT is not
> stopping control flow so I would expect the same problem as before. I
> guess it does not happen perhaps due to inlining differences? I looked

It is weird and I think it is not the correct way.

> at this code briefly when I sent my initial message and I was not sure
> where such a check should exist. It does not look like these functions
> really do any sort of error handling.
>
>> If you are OK with it, I will send a separate formal patch to handle
>> this issue after doing some more testing.
>
> It may still be worth doing this to get some initial thoughts from the
> AMD DRM folks.

I think the correct way is:

Keep the current ASSERT for the aim of debugging, just add BUG() to
stop control flow if the divisor is zero.

--- >8 ---

diff --git a/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c 
b/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
index 88d3f9d7dd55..e15391e36b40 100644
--- a/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
+++ b/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
@@ -52,6 +52,7 @@ static inline unsigned long long 
complete_integer_division_u64(
         unsigned long long result;

         ASSERT(divisor);
+       BUG_ON(!divisor);

         result = div64_u64_rem(dividend, divisor, remainder);

diff --git a/drivers/gpu/drm/amd/display/dc/spl/spl_fixpt31_32.c 
b/drivers/gpu/drm/amd/display/dc/spl/spl_fixpt31_32.c
index 131f1e3949d3..ce2036950808 100644
--- a/drivers/gpu/drm/amd/display/dc/spl/spl_fixpt31_32.c
+++ b/drivers/gpu/drm/amd/display/dc/spl/spl_fixpt31_32.c
@@ -30,6 +30,7 @@ static inline unsigned long long 
spl_complete_integer_division_u64(
         unsigned long long result;

         SPL_ASSERT(divisor);
+       BUG_ON(!divisor);

         result = spl_div64_u64_rem(dividend, divisor, remainder);

It looks reasonable and works well both on x86 and LoongArch, there are
no the following objtool warnings:

   dc_fixpt_recip() falls through to next function dc_fixpt_sinc()
   spl_fixpt_recip() falls through to next function spl_fixpt_sinc()

If no more comments, I will send a separate formal patch for your
review in the next week.

Thanks,
Tiezhu


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-12-25  9:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20241217010905.13054-1-yangtiezhu@loongson.cn>
     [not found] ` <20241217015006.30305-1-yangtiezhu@loongson.cn>
     [not found]   ` <CAAhV-H4WpcWjrQdkm3Sx4DVbp=2oW9nNGAbNqR4BEmiryrptNQ@mail.gmail.com>
     [not found]     ` <20241218190558.g2hykmjgbny4n5i3@jpoimboe>
     [not found]       ` <CADnq5_PFcHzob4poLa86_K4yP5gUT7Ei4Rz5vSUofvZTmB48-g@mail.gmail.com>
     [not found]         ` <4bace457-cc26-13a3-bc90-ad6ad12bc2ed@loongson.cn>
2024-12-20 10:31           ` [PATCH v6 9/9] drm/amd/display: Mark dc_fixpt_from_fraction() noinline Peter Zijlstra
2024-12-20 22:34             ` Nathan Chancellor
2024-12-21  7:40               ` Xi Ruoyao
2024-12-22  4:27                 ` Tiezhu Yang
2024-12-23 21:46                   ` Nathan Chancellor
2024-12-25  9:43                     ` Tiezhu Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox