* Kernel oops caused by signed divide
From: Zac Ecob @ 2024-09-09 17:21 UTC
To: bpf@vger.kernel.org

Hello,

I recently received a kernel 'oops' about a divide error.
After some research, it seems that the 'div64_s64' function used for the 'MOD'/'REM' instructions boils down to an 'idiv'.

The 'dividend' is set to INT64_MIN, and the 'divisor' to -1, then because of two's complement, there is no corresponding positive value, causing the error (at least to my understanding).

Apologies if this is already known / not a relevant concern.

* Re: Kernel oops caused by signed divide
From: Yonghong Song @ 2024-09-09 17:27 UTC
To: Zac Ecob, bpf@vger.kernel.org

On 9/9/24 10:21 AM, Zac Ecob wrote:
> Hello,
>
> I recently received a kernel 'oops' about a divide error.
> After some research, it seems that the 'div64_s64' function used for the 'MOD'/'REM' instructions boils down to an 'idiv'.
>
> The 'dividend' is set to INT64_MIN, and the 'divisor' to -1, then because of two's complement, there is no corresponding positive value, causing the error (at least to my understanding).

Could you provide a reproducible test case for this?
It will make it easy to debug the issue.

* Re: Kernel oops caused by signed divide
From: Alexei Starovoitov @ 2024-09-09 17:29 UTC
To: Zac Ecob, Yonghong Song, Daniel Borkmann
Cc: bpf@vger.kernel.org

On Mon, Sep 9, 2024 at 10:21 AM Zac Ecob <zacecob@protonmail.com> wrote:
>
> Hello,
>
> I recently received a kernel 'oops' about a divide error.
> After some research, it seems that the 'div64_s64' function used for the 'MOD'/'REM' instructions boils down to an 'idiv'.
>
> The 'dividend' is set to INT64_MIN, and the 'divisor' to -1, then because of two's complement, there is no corresponding positive value, causing the error (at least to my understanding).
>
> Apologies if this is already known / not a relevant concern.

Thanks for the report. This is a new issue.

Yonghong,

it's related to the new signed div insn.
It sounds like we need to update chk_and_div[] part of
the verifier to account for signed div differently.

* Re: Kernel oops caused by signed divide
From: Yonghong Song @ 2024-09-09 23:47 UTC
To: Alexei Starovoitov, Zac Ecob, Daniel Borkmann
Cc: bpf@vger.kernel.org

On 9/9/24 10:29 AM, Alexei Starovoitov wrote:
> On Mon, Sep 9, 2024 at 10:21 AM Zac Ecob <zacecob@protonmail.com> wrote:
>> [...]
> Thanks for the report. This is a new issue.
>
> Yonghong,
>
> it's related to the new signed div insn.
> It sounds like we need to update chk_and_div[] part of
> the verifier to account for signed div differently.

Okay. Indeed, INT64_MIN/(-1) cannot be represented.
I will do something similar to chk_and_div[] to filter out
this corner case.

* Re: Kernel oops caused by signed divide
From: Yonghong Song @ 2024-09-10 14:21 UTC
To: Alexei Starovoitov, Zac Ecob, Daniel Borkmann
Cc: bpf@vger.kernel.org

On 9/9/24 10:29 AM, Alexei Starovoitov wrote:
> On Mon, Sep 9, 2024 at 10:21 AM Zac Ecob <zacecob@protonmail.com> wrote:
>> [...]
> Thanks for the report. This is a new issue.
>
> Yonghong,
>
> it's related to the new signed div insn.
> It sounds like we need to update chk_and_div[] part of
> the verifier to account for signed div differently.

In verifier, we have
    /* [R,W]x div 0 -> 0 */
    /* [R,W]x mod 0 -> [R,W]x */

What is the value for
    Rx_a sdiv Rx_b -> ?
where Rx_a = INT64_MIN and Rx_b = -1?

Should we just do
    INT64_MIN sdiv -1 -> -1
or some other value?

* RE: Kernel oops caused by signed divide
From: Dave Thaler @ 2024-09-10 14:44 UTC
To: 'Yonghong Song', 'Alexei Starovoitov', 'Zac Ecob', 'Daniel Borkmann'
Cc: bpf

Yonghong Song wrote:
[...]
> In verifier, we have
>     /* [R,W]x div 0 -> 0 */
>     /* [R,W]x mod 0 -> [R,W]x */
>
> What is the value for
>     Rx_a sdiv Rx_b -> ?
> where Rx_a = INT64_MIN and Rx_b = -1?
>
> Should we just do
>     INT64_MIN sdiv -1 -> -1
> or some other value?

What happens for BPF_NEG INT64_MIN?

Dave

* Re: Kernel oops caused by signed divide
From: Yonghong Song @ 2024-09-10 15:18 UTC
To: Dave Thaler, 'Alexei Starovoitov', 'Zac Ecob', 'Daniel Borkmann'
Cc: bpf

On 9/10/24 7:44 AM, Dave Thaler wrote:
> Yonghong Song wrote:
> [...]
>> In verifier, we have
>>     /* [R,W]x div 0 -> 0 */
>>     /* [R,W]x mod 0 -> [R,W]x */
>>
>> What is the value for
>>     Rx_a sdiv Rx_b -> ?
>> where Rx_a = INT64_MIN and Rx_b = -1?
>>
>> Should we just do
>>     INT64_MIN sdiv -1 -> -1
>> or some other value?
> What happens for BPF_NEG INT64_MIN?

Right. This is equivalent to INT64_MIN/-1. Indeed, we need to check and protect against this case as well.

* Re: Kernel oops caused by signed divide
From: Alexei Starovoitov @ 2024-09-10 15:21 UTC
To: Yonghong Song
Cc: Dave Thaler, Zac Ecob, Daniel Borkmann, bpf

On Tue, Sep 10, 2024 at 8:18 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>
> On 9/10/24 7:44 AM, Dave Thaler wrote:
> > [...]
> > What happens for BPF_NEG INT64_MIN?
>
> Right. This is equivalent to INT64_MIN/-1. Indeed, we need to check and protect against this case as well.

why? what's wrong with bpf_neg -1 ?

* Re: Kernel oops caused by signed divide
From: Yonghong Song @ 2024-09-10 18:12 UTC
To: Alexei Starovoitov
Cc: Dave Thaler, Zac Ecob, Daniel Borkmann, bpf

On 9/10/24 8:21 AM, Alexei Starovoitov wrote:
> On Tue, Sep 10, 2024 at 8:18 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>> On 9/10/24 7:44 AM, Dave Thaler wrote:
>>> [...]
>>> What happens for BPF_NEG INT64_MIN?
>> Right. This is equivalent to INT64_MIN/-1. Indeed, we need check and protect for this case as well.
> why? what's wrong with bpf_neg -1 ?

I think you are right. 'bpf_neg <num>' should not cause any exception.
In this particular case 'bpf_neg LLONG_MIN' equals LLONG_MIN.

On arm64,

# cat t4.c
#include <stdio.h>
#include <limits.h>
int main(void) {
  volatile long long a = LLONG_MIN;
  printf("-a = %lld\n", -a);
  return 0;
}
# gcc -O2 t4.c && ./a.out
-a = -9223372036854775808

In the above -a also equals LLONG_MIN.

On x86, we get the same result.

$ uname -a
Linux ... #1 SMP Wed Jun 5 06:21:21 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
$ cat t4.c
#include <stdio.h>
#include <limits.h>
int main(void) {
  volatile long long a = LLONG_MIN;
  printf("-a = %lld\n", -a);
  return 0;
}
$ gcc -O2 t4.c && ./a.out
-a = -9223372036854775808
$ clang -O2 t4.c && ./a.out
-a = -9223372036854775808
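
A side note on the negation test: in ISO C, evaluating `-a` for a signed `a` equal to LLONG_MIN is itself undefined behavior, so the printed result reflects what the generated negate instruction does rather than anything the language guarantees. A minimal sketch of a negation that wraps in a defined way is shown here; the helper name `wrapping_neg64` is hypothetical and is not a kernel or BPF function, and the final conversion back to `int64_t` relies on the modular behavior that gcc and clang document.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): two's-complement negation
 * without signed overflow. The subtraction happens in uint64_t, where
 * wraparound is well defined; converting the result back to int64_t is
 * implementation-defined in ISO C and is assumed here to be modular,
 * as it is with gcc and clang.
 */
static int64_t wrapping_neg64(int64_t v)
{
    return (int64_t)(0 - (uint64_t)v);
}
```

With this helper, negating INT64_MIN yields INT64_MIN again, matching the arm64 and x86 output quoted above, while every other value negates normally.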

* Re: Kernel oops caused by signed divide
From: Alexei Starovoitov @ 2024-09-10 15:21 UTC
To: Yonghong Song
Cc: Zac Ecob, Daniel Borkmann, bpf@vger.kernel.org

On Tue, Sep 10, 2024 at 7:21 AM Yonghong Song <yonghong.song@linux.dev> wrote:
> [...]
> In verifier, we have
>     /* [R,W]x div 0 -> 0 */
>     /* [R,W]x mod 0 -> [R,W]x */

the verifier is doing what hw does. In this case this is arm64 behavior.

> What is the value for
>     Rx_a sdiv Rx_b -> ?
> where Rx_a = INT64_MIN and Rx_b = -1?

Why does it matter what Rx_a contains ?
What cpus do in this case?

> Should we just do
>     INT64_MIN sdiv -1 -> -1
> or some other value?

* Re: Kernel oops caused by signed divide
From: Yonghong Song @ 2024-09-10 18:02 UTC
To: Alexei Starovoitov
Cc: Zac Ecob, Daniel Borkmann, bpf@vger.kernel.org

On 9/10/24 8:21 AM, Alexei Starovoitov wrote:
> On Tue, Sep 10, 2024 at 7:21 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>> [...]
>> In verifier, we have
>>     /* [R,W]x div 0 -> 0 */
>>     /* [R,W]x mod 0 -> [R,W]x */
> the verifier is doing what hw does. In this case this is arm64 behavior.

Okay, I see. I tried on an arm64 machine and it indeed behaves like the above.

# uname -a
Linux ... #1 SMP PREEMPT_DYNAMIC Thu Aug 1 06:58:32 PDT 2024 aarch64 aarch64 aarch64 GNU/Linux
# cat t2.c
#include <stdio.h>
#include <limits.h>
int main(void) {
  volatile long long a = 5;
  volatile long long b = 0;
  printf("a/b = %lld\n", a/b);
  return 0;
}
# cat t3.c
#include <stdio.h>
#include <limits.h>
int main(void) {
  volatile long long a = 5;
  volatile long long b = 0;
  printf("a%%b = %lld\n", a%b);
  return 0;
}
# gcc -O2 t2.c && ./a.out
a/b = 0
# gcc -O2 t3.c && ./a.out
a%b = 5

On arm64, a clang18-compiled binary has the same result:

# clang -O2 t2.c && ./a.out
a/b = 0
# clang -O2 t3.c && ./a.out
a%b = 5

The same source code, also compiled with -O2 on x86_64, generates:
  Floating point exception (core dumped)

>> What the value for
>>     Rx_a sdiv Rx_b -> ?
>> where Rx_a = INT64_MIN and Rx_b = -1?
> Why does it matter what Rx_a contains ?

It does matter. See below:

on arm64:

# cat t1.c
#include <stdio.h>
#include <limits.h>
int main(void) {
  volatile long long a = LLONG_MIN;
  volatile long long b = -1;
  printf("a/b = %lld\n", a/b);
  return 0;
}
# clang -O2 t1.c && ./a.out
a/b = -9223372036854775808
# gcc -O2 t1.c && ./a.out
a/b = -9223372036854775808

So the result of a/b is LLONG_MIN.

The same code will cause an exception on x86_64:

$ uname -a
Linux ... #1 SMP Wed Jun 5 06:21:21 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
[yhs@devvm1513.prn0 ~]$ gcc -O2 t1.c && ./a.out
Floating point exception (core dumped)
[yhs@devvm1513.prn0 ~]$ clang -O2 t1.c && ./a.out
Floating point exception (core dumped)

So this is what we care about.

So I guess we can follow the arm64 result too.

> What cpus do in this case?

See above. arm64 produces *some* result while x64 causes an exception.
We do need special handling for the LLONG_MIN/(-1) case.

>> Should we just do
>>     INT64_MIN sdiv -1 -> -1
>> or some other values?
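
To make the semantics under discussion concrete, here is a rough C sketch of a signed 64-bit divide and modulo that never trap. It keeps the two division-by-zero rules from the verifier comments quoted above and adds a special case for a divisor of -1. The names `guarded_sdiv64` and `guarded_smod64` are hypothetical and do not exist in the kernel. Returning INT64_MIN for INT64_MIN / -1 mirrors the arm64 output shown above and is an assumption for illustration, not a decision reached at this point in the thread.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not kernel functions. */

static int64_t guarded_sdiv64(int64_t dividend, int64_t divisor)
{
    if (divisor == 0)
        return 0;                       /* [R,W]x div 0 -> 0 */
    if (divisor == -1)
        /*
         * x / -1 is -x. Negate in uint64_t so INT64_MIN wraps back to
         * INT64_MIN instead of overflowing (assumed arm64-like result).
         * The cast back to int64_t is modular on gcc and clang.
         */
        return (int64_t)(0 - (uint64_t)dividend);
    return dividend / divisor;          /* cannot overflow or trap here */
}

static int64_t guarded_smod64(int64_t dividend, int64_t divisor)
{
    if (divisor == 0)
        return dividend;                /* [R,W]x mod 0 -> [R,W]x */
    if (divisor == -1)
        return 0;                       /* x % -1 is 0 for every x */
    return dividend % divisor;
}
```

The divisor check comes first because, once 0 and -1 are excluded, no remaining operand pair can make the hardware divide fault, so the dividend never needs to be inspected.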

* Re: Kernel oops caused by signed divide
From: Alexei Starovoitov @ 2024-09-10 18:25 UTC
To: Yonghong Song
Cc: Zac Ecob, Daniel Borkmann, bpf@vger.kernel.org

On Tue, Sep 10, 2024 at 11:02 AM Yonghong Song <yonghong.song@linux.dev> wrote:
> [...]
> > What cpus do in this case?
>
> See above. arm64 produces *some* result while x64 cause exception.
> We do need to special handle for LLONG_MIN/(-1) case.

My point about Rx_a is that idiv will cause an out-of-range exception
for many other values than Rx_a == INT64_MIN.
I'm not sure that divisor -1 is the only such case either.
Probably is, since intuitively -2 and all other divisors should fit fine.
So the check likely needs Rx_b == -1 and a check for high bit in Rx_a ?

* Re: Kernel oops caused by signed divide
From: Yonghong Song @ 2024-09-10 19:32 UTC
To: Alexei Starovoitov
Cc: Zac Ecob, Daniel Borkmann, bpf@vger.kernel.org

On 9/10/24 11:25 AM, Alexei Starovoitov wrote:
> On Tue, Sep 10, 2024 at 11:02 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>> [...]
>> See above. arm64 produces *some* result while x64 cause exception.
>> We do need to special handle for LLONG_MIN/(-1) case.
> My point about Rx_a that idiv will cause out-of-range exception
> for many other values than Rx_a == INT64_MIN.
> I'm not sure that divisor -1 is the only such case either.
> Probably is, since intuitively -2 and all other divisors should fit fine.
> So the check likely needs Rx_b == -1 and a check for high bit in Rx_a ?

Looks like only Rx_a == INT64_MIN may cause the problem.
All other Rx_a numbers (from INT64_MIN+1 to INT64_MAX)
should be okay. Some selective testing below on x64 host:

$ cat t5.c
#include <stdio.h>
#include <limits.h>

unsigned long long res;
int main(void) {
  volatile long long a;
  long long i;
  for (i = LLONG_MIN + 1; i <= LLONG_MIN + 100; i++) {
    volatile long long b = -1;
    a = i;
    res += (unsigned long long)(a/b);
  }
  for (i = LLONG_MAX - 100; i <= LLONG_MAX - 1; i++) {
    volatile long long b = -1;
    a = i;
    res += (unsigned long long)(a/b);
  }
  printf("res = %llx\n", res);
  return 0;
}
$ gcc -O2 t5.c && ./a.out
res = 64

So I think it should be okay if the range is from LLONG_MIN + 1
to LLONG_MAX - 1.

Now for LLONG_MAX/(-1)

$ cat t6.c
#include <stdio.h>
#include <limits.h>
int main(void) {
  volatile long long a = LLONG_MAX;
  volatile long long b = -1;
  printf("a/b = %lld\n", a/b);
  return 0;
}
$ gcc -O2 t6.c && ./a.out
a/b = -9223372036854775807

It is okay too. So I think LLONG_MIN/(-1) is the only case
we should take care of.
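
The selective test above samples only a few hundred dividends. As a complementary sanity check, the same question can be answered exhaustively for a narrower type: the sketch below walks every pair of 8-bit signed operands and counts the quotients that do not fit back into 8 bits. The function name `count_int8_div_overflows` is made up for this illustration, and the 8-bit width is only a stand-in that assumes the same two's-complement argument carries over to 64 bits.

```c
#include <stdint.h>

/*
 * Hypothetical helper: enumerate every (dividend, divisor) pair of
 * 8-bit signed values, skipping divisor 0, and count the truncating
 * quotients that fall outside [INT8_MIN, INT8_MAX]. The division is
 * done in plain int, so it cannot overflow while we test it.
 * The last offending pair is reported through the out-parameters.
 */
static int count_int8_div_overflows(int *bad_dividend, int *bad_divisor)
{
    int count = 0;

    for (int a = INT8_MIN; a <= INT8_MAX; a++) {
        for (int b = INT8_MIN; b <= INT8_MAX; b++) {
            if (b == 0)
                continue;               /* division by zero is a separate case */
            int q = a / b;
            if (q < INT8_MIN || q > INT8_MAX) {
                count++;
                *bad_dividend = a;
                *bad_divisor = b;
            }
        }
    }
    return count;
}
```

Exactly one pair is reported, (-128, -1), which is the 8-bit analogue of LLONG_MIN/(-1).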

* Re: Kernel oops caused by signed divide
From: Alexei Starovoitov @ 2024-09-10 21:53 UTC
To: Yonghong Song
Cc: Zac Ecob, Daniel Borkmann, bpf@vger.kernel.org

On Tue, Sep 10, 2024 at 12:32 PM Yonghong Song <yonghong.song@linux.dev> wrote:
> [...]
> Looks like only Rx_a == INT64_MIN may cause the problem.
> All other Rx_a numbers (from INT64_MIN+1 to INT64_MAX)
> should be okay. Some selective testing below on x64 host:
>
> $ cat t5.c
> #include <stdio.h>
> #include <limits.h>
>
> unsigned long long res;
> int main(void) {
>   volatile long long a;
>   long long i;
>   for (i = LLONG_MIN + 1; i <= LLONG_MIN + 100; i++) {
>     volatile long long b = -1;
>     a = i;
>     res += (unsigned long long)(a/b);
>   }
>   for (i = LLONG_MAX - 100; i <= LLONG_MAX - 1; i++) {

Changing this test to i <= LLONG_MAX
and compiling with gcc -O0 or clang -O2 or clang -O0
is causing an exception,
because 'a' becomes LLONG_MIN.
Compilers are doing some odd code gen.
I don't understand how 'i' can wrap this way.

>     volatile long long b = -1;
>     a = i;
>     res += (unsigned long long)(a/b);
>   }
>   printf("res = %llx\n", res);
>   return 0;
> }
> $ gcc -O2 t5.c && ./a.out
> res = 64
>
> So I think it should be okay if the range is from LLONG_MIN + 1
> to LLONG_MAX - 1.
>
> Now for LLONG_MAX/(-1)
>
> $ cat t6.c
> #include <stdio.h>
> #include <limits.h>
> int main(void) {
>   volatile long long a = LLONG_MAX;
>   volatile long long b = -1;
>   printf("a/b = %lld\n", a/b);
>   return 0;
> }
> $ gcc -O2 t6.c && ./a.out
> a/b = -9223372036854775807
>
> It is okay too. So I think LLONG_MIN/(-1) is the only case
> we should take care of.

The test shows that that's the case, but I still can't wrap
my head around the fact that only LLONG_MIN/(-1) is a problem.

Any math experts can explain this?
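
On the question of why only this one pair is a problem, a short worked argument may help. It is a sketch that assumes 64-bit two's-complement operands and truncating division (which is what x86 `idiv` and arm64 `sdiv` compute); it is not something stated by the participants above.

```latex
\[
  -2^{63} \le a,\, b \le 2^{63}-1, \qquad b \ne 0, \qquad
  q = \operatorname{trunc}(a/b), \qquad |q| \le \frac{|a|}{|b|}.
\]
\[
  |b| \ge 2 \;\Rightarrow\; |q| \le \frac{2^{63}}{2} = 2^{62}
  \quad \text{(always representable)},
\]
\[
  b = 1 \;\Rightarrow\; q = a, \qquad
  b = -1 \;\Rightarrow\; q = -a \in \bigl[-(2^{63}-1),\; 2^{63}\bigr],
\]
\[
  q = 2^{63} > 2^{63}-1 \iff a = -2^{63}.
\]
```

So dividing can only shrink the magnitude, the representable range is lopsided by exactly one value on the negative side, and the single quotient that lands outside it is INT64_MIN / -1. The matching remainder is mathematically 0, yet it can still fault on x86 because the same `idiv` instruction produces both quotient and remainder.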
* Re: Kernel oops caused by signed divide
  2024-09-10 21:53 ` Alexei Starovoitov
@ 2024-09-10 22:00 ` Yonghong Song
  2024-09-10 22:43 ` Andrii Nakryiko
  1 sibling, 0 replies; 16+ messages in thread
From: Yonghong Song @ 2024-09-10 22:00 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Zac Ecob, Daniel Borkmann, bpf@vger.kernel.org

On 9/10/24 2:53 PM, Alexei Starovoitov wrote:
> On Tue, Sep 10, 2024 at 12:32 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>>
>> On 9/10/24 11:25 AM, Alexei Starovoitov wrote:
>>> On Tue, Sep 10, 2024 at 11:02 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>>>> On 9/10/24 8:21 AM, Alexei Starovoitov wrote:
>>>>> On Tue, Sep 10, 2024 at 7:21 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>>>>>> On 9/9/24 10:29 AM, Alexei Starovoitov wrote:
>>>>>>> On Mon, Sep 9, 2024 at 10:21 AM Zac Ecob <zacecob@protonmail.com> wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I recently received a kernel 'oops' about a divide error.
>>>>>>>> After some research, it seems that the 'div64_s64' function used for the 'MOD'/'REM' instructions boils down to an 'idiv'.
>>>>>>>>
>>>>>>>> The 'dividend' is set to INT64_MIN, and the 'divisor' to -1, then because of two's complement, there is no corresponding positive value, causing the error (at least to my understanding).
>>>>>>>>
>>>>>>>>
>>>>>>>> Apologies if this is already known / not a relevant concern.
>>>>>>> Thanks for the report. This is a new issue.
>>>>>>>
>>>>>>> Yonghong,
>>>>>>>
>>>>>>> it's related to the new signed div insn.
>>>>>>> It sounds like we need to update chk_and_div[] part of
>>>>>>> the verifier to account for signed div differently.
>>>>>> In verifier, we have
>>>>>> /* [R,W]x div 0 -> 0 */
>>>>>> /* [R,W]x mod 0 -> [R,W]x */
>>>>> the verifier is doing what hw does. In this case this is arm64 behavior.
>>>> Okay, I see. I tried on an arm64 machine and it indeed behaves like the above.
>>>>
>>>> # uname -a
>>>> Linux ... #1 SMP PREEMPT_DYNAMIC Thu Aug 1 06:58:32 PDT 2024 aarch64 aarch64 aarch64 GNU/Linux
>>>> # cat t2.c
>>>> #include <stdio.h>
>>>> #include <limits.h>
>>>> int main(void) {
>>>>     volatile long long a = 5;
>>>>     volatile long long b = 0;
>>>>     printf("a/b = %lld\n", a/b);
>>>>     return 0;
>>>> }
>>>> # cat t3.c
>>>> #include <stdio.h>
>>>> #include <limits.h>
>>>> int main(void) {
>>>>     volatile long long a = 5;
>>>>     volatile long long b = 0;
>>>>     printf("a%%b = %lld\n", a%b);
>>>>     return 0;
>>>> }
>>>> # gcc -O2 t2.c && ./a.out
>>>> a/b = 0
>>>> # gcc -O2 t3.c && ./a.out
>>>> a%b = 5
>>>>
>>>> on arm64, clang18 compiled binary has the same result
>>>>
>>>> # clang -O2 t2.c && ./a.out
>>>> a/b = 0
>>>> # clang -O2 t3.c && ./a.out
>>>> a%b = 5
>>>>
>>>> The same source code, compiled on x86_64 with -O2 as well,
>>>> generates:
>>>> Floating point exception (core dumped)
>>>>
>>>>>> What is the value for
>>>>>> Rx_a sdiv Rx_b -> ?
>>>>>> where Rx_a = INT64_MIN and Rx_b = -1?
>>>>> Why does it matter what Rx_a contains ?
>>>> It does matter. See below:
>>>>
>>>> on arm64:
>>>>
>>>> # cat t1.c
>>>> #include <stdio.h>
>>>> #include <limits.h>
>>>> int main(void) {
>>>>     volatile long long a = LLONG_MIN;
>>>>     volatile long long b = -1;
>>>>     printf("a/b = %lld\n", a/b);
>>>>     return 0;
>>>> }
>>>> # clang -O2 t1.c && ./a.out
>>>> a/b = -9223372036854775808
>>>> # gcc -O2 t1.c && ./a.out
>>>> a/b = -9223372036854775808
>>>>
>>>> So the result of a/b is LLONG_MIN
>>>>
>>>> The same code will cause an exception on x86_64:
>>>>
>>>> $ uname -a
>>>> Linux ... #1 SMP Wed Jun 5 06:21:21 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
>>>> [yhs@devvm1513.prn0 ~]$ gcc -O2 t1.c && ./a.out
>>>> Floating point exception (core dumped)
>>>> [yhs@devvm1513.prn0 ~]$ clang -O2 t1.c && ./a.out
>>>> Floating point exception (core dumped)
>>>>
>>>> So this is what we care about.
>>>>
>>>> So I guess we can follow the arm64 result too.
>>>>
>>>>> What cpus do in this case?
>>>> See above. arm64 produces *some* result while x64 causes an exception.
>>>> We do need special handling for the LLONG_MIN/(-1) case.
>>> My point about Rx_a is that idiv will cause an out-of-range exception
>>> for many other values than Rx_a == INT64_MIN.
>>> I'm not sure that divisor -1 is the only such case either.
>>> Probably is, since intuitively -2 and all other divisors should fit fine.
>>> So the check likely needs Rx_b == -1 and a check for high bit in Rx_a ?
>> Looks like only Rx_a == INT64_MIN may cause the problem.
>> All other Rx_a numbers (from INT64_MIN+1 to INT64_MAX)
>> should be okay. Some selective testing below on x64 host:
>>
>> $ cat t5.c
>> #include <stdio.h>
>> #include <limits.h>
>>
>> unsigned long long res;
>> int main(void) {
>>     volatile long long a;
>>     long long i;
>>     for (i = LLONG_MIN + 1; i <= LLONG_MIN + 100; i++) {
>>         volatile long long b = -1;
>>         a = i;
>>         res += (unsigned long long)(a/b);
>>     }
>>     for (i = LLONG_MAX - 100; i <= LLONG_MAX - 1; i++) {
> Changing this test to i <= LLONG_MAX
> and compiling with gcc -O0 or clang -O2 or clang -O0
> is causing an exception,
> because 'a' becomes LLONG_MIN.

This is my theory: if the test is changed to i <= LLONG_MAX, then after the
iteration with i = LLONG_MAX the loop still executes 'i++'. That is where it
gets tricky, because undefined behavior kicks in: 'i++' goes out of range
for long long.

> Compilers are doing some odd code gen.
> I don't understand how 'i' can wrap this way.
>
>>         volatile long long b = -1;
>>         a = i;
>>         res += (unsigned long long)(a/b);
>>     }
>>     printf("res = %llx\n", res);
>>     return 0;
>> }
>> $ gcc -O2 t5.c && ./a.out
>> res = 64
>>
>> So I think it should be okay if the range is from LLONG_MIN + 1
>> to LLONG_MAX - 1.
>>
>> Now for LLONG_MAX/(-1)
>>
>> $ cat t6.c
>> #include <stdio.h>
>> #include <limits.h>
>> int main(void) {
>>     volatile long long a = LLONG_MAX;
>>     volatile long long b = -1;
>>     printf("a/b = %lld\n", a/b);
>>     return 0;
>> }
>> $ gcc -O2 t6.c && ./a.out
>> a/b = -9223372036854775807
>>
>> It is okay too. So I think LLONG_MIN/(-1) is the only case
>> we should take care of.
> The test shows that that's the case, but I still can't wrap
> my head around why only LLONG_MIN/(-1) is a problem.
>
> Any math experts can explain this?

^ permalink raw reply [flat|nested] 16+ messages in thread
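The overflow theory above can be illustrated with a loop that reaches its upper bound without ever incrementing past it. This is only a minimal sketch, not code from the thread: the helper name sum_quotients_by_minus_one is hypothetical, and it assumes the caller keeps the lower bound above LLONG_MIN so that LLONG_MIN / -1 is never evaluated.

```c
#include <limits.h>

/*
 * Hypothetical helper (not from the thread): sums a / -1 for every a in
 * [first, last], both ends inclusive, accumulating in unsigned arithmetic
 * exactly like 'res' in t5.c.
 *
 * Assumption: first > LLONG_MIN, so LLONG_MIN / -1 is never computed.
 */
unsigned long long sum_quotients_by_minus_one(long long first, long long last)
{
    unsigned long long res = 0;
    long long i = first;

    if (first > last)
        return 0;

    for (;;) {
        /* volatile keeps the compiler from folding the division away */
        volatile long long a = i;
        volatile long long b = -1;

        res += (unsigned long long)(a / b);

        /*
         * Compare before incrementing. With 'i <= last; i++' and
         * last == LLONG_MAX, the final i++ would overflow a signed
         * long long, which is undefined behavior.
         */
        if (i == last)
            break;
        i++;
    }
    return res;
}
```

Called with the two ranges from t5.c, (LLONG_MIN + 1, LLONG_MIN + 100) and (LLONG_MAX - 100, LLONG_MAX - 1), the two sums add up to 0x64, which agrees with the "res = 64" output quoted above. Passing LLONG_MAX as the upper bound is also safe, because the increment is skipped once the bound is reached.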
* Re: Kernel oops caused by signed divide
  2024-09-10 21:53 ` Alexei Starovoitov
  2024-09-10 22:00 ` Yonghong Song
@ 2024-09-10 22:43 ` Andrii Nakryiko
  1 sibling, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2024-09-10 22:43 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Yonghong Song, Zac Ecob, Daniel Borkmann, bpf@vger.kernel.org

On Tue, Sep 10, 2024 at 2:53 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Sep 10, 2024 at 12:32 PM Yonghong Song <yonghong.song@linux.dev> wrote:
> >
> >
> > On 9/10/24 11:25 AM, Alexei Starovoitov wrote:
> > > On Tue, Sep 10, 2024 at 11:02 AM Yonghong Song <yonghong.song@linux.dev> wrote:
> > >>
> > >> On 9/10/24 8:21 AM, Alexei Starovoitov wrote:
> > >>> On Tue, Sep 10, 2024 at 7:21 AM Yonghong Song <yonghong.song@linux.dev> wrote:
> > >>>> On 9/9/24 10:29 AM, Alexei Starovoitov wrote:
> > >>>>> On Mon, Sep 9, 2024 at 10:21 AM Zac Ecob <zacecob@protonmail.com> wrote:
> > >>>>>> Hello,
> > >>>>>>
> > >>>>>> I recently received a kernel 'oops' about a divide error.
> > >>>>>> After some research, it seems that the 'div64_s64' function used for the 'MOD'/'REM' instructions boils down to an 'idiv'.
> > >>>>>>
> > >>>>>> The 'dividend' is set to INT64_MIN, and the 'divisor' to -1, then because of two's complement, there is no corresponding positive value, causing the error (at least to my understanding).
> > >>>>>>
> > >>>>>>
> > >>>>>> Apologies if this is already known / not a relevant concern.
> > >>>>> Thanks for the report. This is a new issue.
> > >>>>>
> > >>>>> Yonghong,
> > >>>>>
> > >>>>> it's related to the new signed div insn.
> > >>>>> It sounds like we need to update chk_and_div[] part of
> > >>>>> the verifier to account for signed div differently.
> > >>>> In verifier, we have
> > >>>> /* [R,W]x div 0 -> 0 */
> > >>>> /* [R,W]x mod 0 -> [R,W]x */
> > >>> the verifier is doing what hw does. In this case this is arm64 behavior.
> > >> Okay, I see. I tried on an arm64 machine and it indeed behaves like the above.
> > >>
> > >> # uname -a
> > >> Linux ... #1 SMP PREEMPT_DYNAMIC Thu Aug 1 06:58:32 PDT 2024 aarch64 aarch64 aarch64 GNU/Linux
> > >> # cat t2.c
> > >> #include <stdio.h>
> > >> #include <limits.h>
> > >> int main(void) {
> > >>     volatile long long a = 5;
> > >>     volatile long long b = 0;
> > >>     printf("a/b = %lld\n", a/b);
> > >>     return 0;
> > >> }
> > >> # cat t3.c
> > >> #include <stdio.h>
> > >> #include <limits.h>
> > >> int main(void) {
> > >>     volatile long long a = 5;
> > >>     volatile long long b = 0;
> > >>     printf("a%%b = %lld\n", a%b);
> > >>     return 0;
> > >> }
> > >> # gcc -O2 t2.c && ./a.out
> > >> a/b = 0
> > >> # gcc -O2 t3.c && ./a.out
> > >> a%b = 5
> > >>
> > >> on arm64, clang18 compiled binary has the same result
> > >>
> > >> # clang -O2 t2.c && ./a.out
> > >> a/b = 0
> > >> # clang -O2 t3.c && ./a.out
> > >> a%b = 5
> > >>
> > >> The same source code, compiled on x86_64 with -O2 as well,
> > >> generates:
> > >> Floating point exception (core dumped)
> > >>
> > >>>> What is the value for
> > >>>> Rx_a sdiv Rx_b -> ?
> > >>>> where Rx_a = INT64_MIN and Rx_b = -1?
> > >>> Why does it matter what Rx_a contains ?
> > >> It does matter. See below:
> > >>
> > >> on arm64:
> > >>
> > >> # cat t1.c
> > >> #include <stdio.h>
> > >> #include <limits.h>
> > >> int main(void) {
> > >>     volatile long long a = LLONG_MIN;
> > >>     volatile long long b = -1;
> > >>     printf("a/b = %lld\n", a/b);
> > >>     return 0;
> > >> }
> > >> # clang -O2 t1.c && ./a.out
> > >> a/b = -9223372036854775808
> > >> # gcc -O2 t1.c && ./a.out
> > >> a/b = -9223372036854775808
> > >>
> > >> So the result of a/b is LLONG_MIN
> > >>
> > >> The same code will cause an exception on x86_64:
> > >>
> > >> $ uname -a
> > >> Linux ... #1 SMP Wed Jun 5 06:21:21 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
> > >> [yhs@devvm1513.prn0 ~]$ gcc -O2 t1.c && ./a.out
> > >> Floating point exception (core dumped)
> > >> [yhs@devvm1513.prn0 ~]$ clang -O2 t1.c && ./a.out
> > >> Floating point exception (core dumped)
> > >>
> > >> So this is what we care about.
> > >>
> > >> So I guess we can follow the arm64 result too.
> > >>
> > >>> What cpus do in this case?
> > >> See above. arm64 produces *some* result while x64 causes an exception.
> > >> We do need special handling for the LLONG_MIN/(-1) case.
> > > My point about Rx_a is that idiv will cause an out-of-range exception
> > > for many other values than Rx_a == INT64_MIN.
> > > I'm not sure that divisor -1 is the only such case either.
> > > Probably is, since intuitively -2 and all other divisors should fit fine.
> > > So the check likely needs Rx_b == -1 and a check for high bit in Rx_a ?
> >
> > Looks like only Rx_a == INT64_MIN may cause the problem.
> > All other Rx_a numbers (from INT64_MIN+1 to INT64_MAX)
> > should be okay. Some selective testing below on x64 host:
> >
> > $ cat t5.c
> > #include <stdio.h>
> > #include <limits.h>
> >
> > unsigned long long res;
> > int main(void) {
> >     volatile long long a;
> >     long long i;
> >     for (i = LLONG_MIN + 1; i <= LLONG_MIN + 100; i++) {
> >         volatile long long b = -1;
> >         a = i;
> >         res += (unsigned long long)(a/b);
> >     }
> >     for (i = LLONG_MAX - 100; i <= LLONG_MAX - 1; i++) {
>
> Changing this test to i <= LLONG_MAX
> and compiling with gcc -O0 or clang -O2 or clang -O0
> is causing an exception,
> because 'a' becomes LLONG_MIN.
> Compilers are doing some odd code gen.
> I don't understand how 'i' can wrap this way.
>
> >         volatile long long b = -1;
> >         a = i;
> >         res += (unsigned long long)(a/b);
> >     }
> >     printf("res = %llx\n", res);
> >     return 0;
> > }
> > $ gcc -O2 t5.c && ./a.out
> > res = 64
> >
> > So I think it should be okay if the range is from LLONG_MIN + 1
> > to LLONG_MAX - 1.
> >
> > Now for LLONG_MAX/(-1)
> >
> > $ cat t6.c
> > #include <stdio.h>
> > #include <limits.h>
> > int main(void) {
> >     volatile long long a = LLONG_MAX;
> >     volatile long long b = -1;
> >     printf("a/b = %lld\n", a/b);
> >     return 0;
> > }
> > $ gcc -O2 t6.c && ./a.out
> > a/b = -9223372036854775807
> >
> > It is okay too. So I think LLONG_MIN/(-1) is the only case
> > we should take care of.
>
> The test shows that that's the case, but I still can't wrap
> my head around why only LLONG_MIN/(-1) is a problem.
>
> Any math experts can explain this?
>

Not a math expert, but this is because LLONG_MIN / (-1) needs to be
-LLONG_MIN, right? But -LLONG_MIN is not representable in two's complement
representation, because the positive and negative sides are not
"symmetrical":

LLONG_MIN = -9,223,372,036,854,775,808
LLONG_MAX =  9,223,372,036,854,775,807

-LLONG_MIN would be 9,223,372,036,854,775,808, which is beyond the
representable range for a 64-bit signed integer.

That's why Dave asked about BPF_NEG for LLONG_MIN: it's a similar problem,
its result is an unrepresentable value. So in practice -LLONG_MIN ==
LLONG_MIN :)

$ cat main.c
#include <stdio.h>
#include <stdint.h>

int main()
{
    long long x = INT64_MIN;

    printf("%lld %llx %llx\n", x, x, -x);
    return 0;
}

$ cc main.c && ./a.out
-9223372036854775808 8000000000000000 8000000000000000

^ permalink raw reply [flat|nested] 16+ messages in thread
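The semantics discussed in this thread can be modeled in plain C. This is a sketch under stated assumptions, not the verifier's actual instruction patching: the function names sketch_sdiv64 and sketch_smod64 are hypothetical, division by zero follows the quoted verifier comments, LLONG_MIN / -1 follows the arm64 result shown above, and returning 0 for a remainder by -1 is an assumption based on the mathematical result (the thread does not settle it).

```c
#include <limits.h>

/*
 * Hypothetical helpers (names are not from the kernel). They model the
 * results discussed in the thread without ever executing a division
 * that could trap on x86_64.
 */
long long sketch_sdiv64(long long a, long long b)
{
    if (b == 0)
        return 0;               /* [R,W]x div 0 -> 0 */
    if (b == -1) {
        /*
         * Negate in unsigned arithmetic, where wraparound is well
         * defined. LLONG_MIN maps back to itself, matching the arm64
         * output quoted above. Converting the out-of-range unsigned
         * value back to long long is implementation-defined before
         * C23; gcc and clang wrap modulo 2^64.
         */
        return (long long)(0ULL - (unsigned long long)a);
    }
    return a / b;               /* no overflow possible here */
}

long long sketch_smod64(long long a, long long b)
{
    if (b == 0)
        return a;               /* [R,W]x mod 0 -> [R,W]x */
    if (b == -1)
        return 0;               /* assumption: every integer is a multiple of -1 */
    return a % b;
}
```

For example, sketch_sdiv64(LLONG_MIN, -1) yields LLONG_MIN and sketch_sdiv64(LLONG_MAX, -1) yields -LLONG_MAX, the same values the t1.c (arm64) and t6.c runs print. Handling every b == -1 in one branch, rather than testing a == LLONG_MIN as well, keeps the check to a single comparison on the divisor.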
end of thread, other threads:[~2024-09-10 22:43 UTC | newest]

Thread overview: 16+ messages
2024-09-09 17:21 Kernel oops caused by signed divide Zac Ecob
2024-09-09 17:27 ` Yonghong Song
2024-09-09 17:29 ` Alexei Starovoitov
2024-09-09 23:47 ` Yonghong Song
2024-09-10 14:21 ` Yonghong Song
2024-09-10 14:44 ` Dave Thaler
2024-09-10 15:18 ` Yonghong Song
2024-09-10 15:21 ` Alexei Starovoitov
2024-09-10 18:12 ` Yonghong Song
2024-09-10 15:21 ` Alexei Starovoitov
2024-09-10 18:02 ` Yonghong Song
2024-09-10 18:25 ` Alexei Starovoitov
2024-09-10 19:32 ` Yonghong Song
2024-09-10 21:53 ` Alexei Starovoitov
2024-09-10 22:00 ` Yonghong Song
2024-09-10 22:43 ` Andrii Nakryiko