From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Zac Ecob <zacecob@protonmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Subject: Re: Kernel oops caused by signed divide
Date: Tue, 10 Sep 2024 12:32:43 -0700
Message-ID: <6d0e66f9-db1c-444c-b899-1961b41de7c5@linux.dev>
In-Reply-To: <CAADnVQLaOCrxqz7rBjeTJe0EUyAGwtjDKQugyKmFdMGT5=XN4g@mail.gmail.com>
On 9/10/24 11:25 AM, Alexei Starovoitov wrote:
> On Tue, Sep 10, 2024 at 11:02 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>>
>> On 9/10/24 8:21 AM, Alexei Starovoitov wrote:
>>> On Tue, Sep 10, 2024 at 7:21 AM Yonghong Song <yonghong.song@linux.dev> wrote:
>>>> On 9/9/24 10:29 AM, Alexei Starovoitov wrote:
>>>>> On Mon, Sep 9, 2024 at 10:21 AM Zac Ecob <zacecob@protonmail.com> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I recently received a kernel 'oops' about a divide error.
>>>>>> After some research, it seems that the 'div64_s64' function used for the 'MOD'/'REM' instructions boils down to an 'idiv'.
>>>>>>
>>>>>> The 'dividend' is set to INT64_MIN, and the 'divisor' to -1, then because of two's complement, there is no corresponding positive value, causing the error (at least to my understanding).
>>>>>>
>>>>>>
>>>>>> Apologies if this is already known / not a relevant concern.
>>>>> Thanks for the report. This is a new issue.
>>>>>
>>>>> Yonghong,
>>>>>
>>>>> it's related to the new signed div insn.
>>>>> It sounds like we need to update chk_and_div[] part of
>>>>> the verifier to account for signed div differently.
>>>> In verifier, we have
>>>> /* [R,W]x div 0 -> 0 */
>>>> /* [R,W]x mod 0 -> [R,W]x */
>>> the verifier is doing what hw does. In this case this is arm64 behavior.
>> Okay, I see. I tried on an arm64 machine and it indeed behaves like the above.
>>
>> # uname -a
>> Linux ... #1 SMP PREEMPT_DYNAMIC Thu Aug 1 06:58:32 PDT 2024 aarch64 aarch64 aarch64 GNU/Linux
>> # cat t2.c
>> #include <stdio.h>
>> #include <limits.h>
>> int main(void) {
>>     volatile long long a = 5;
>>     volatile long long b = 0;
>>     printf("a/b = %lld\n", a/b);
>>     return 0;
>> }
>> # cat t3.c
>> #include <stdio.h>
>> #include <limits.h>
>> int main(void) {
>>     volatile long long a = 5;
>>     volatile long long b = 0;
>>     printf("a%%b = %lld\n", a%b);
>>     return 0;
>> }
>> # gcc -O2 t2.c && ./a.out
>> a/b = 0
>> # gcc -O2 t3.c && ./a.out
>> a%b = 5
>>
>> on arm64, clang18 compiled binary has the same result
>>
>> # clang -O2 t2.c && ./a.out
>> a/b = 0
>> # clang -O2 t3.c && ./a.out
>> a%b = 5
>>
>> The same source code, also compiled with -O2, fails on x86_64 with:
>> Floating point exception (core dumped)
>>
>>>> What is the value for
>>>> Rx_a sdiv Rx_b -> ?
>>>> where Rx_a = INT64_MIN and Rx_b = -1?
>>> Why does it matter what Rx_a contains ?
>> It does matter. See below:
>>
>> on arm64:
>>
>> # cat t1.c
>> #include <stdio.h>
>> #include <limits.h>
>> int main(void) {
>>     volatile long long a = LLONG_MIN;
>>     volatile long long b = -1;
>>     printf("a/b = %lld\n", a/b);
>>     return 0;
>> }
>> # clang -O2 t1.c && ./a.out
>> a/b = -9223372036854775808
>> # gcc -O2 t1.c && ./a.out
>> a/b = -9223372036854775808
>>
>> So the result of a/b is LLONG_MIN
>>
>> The same code causes an exception on x86_64:
>>
>> $ uname -a
>> Linux ... #1 SMP Wed Jun 5 06:21:21 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
>> [yhs@devvm1513.prn0 ~]$ gcc -O2 t1.c && ./a.out
>> Floating point exception (core dumped)
>> [yhs@devvm1513.prn0 ~]$ clang -O2 t1.c && ./a.out
>> Floating point exception (core dumped)
>>
>> So this is what we care about.
>>
>> So I guess we can follow arm64 result too.
>>
>>> What cpus do in this case?
>> See above. arm64 produces *some* result while x86_64 raises an exception.
>> We do need to special-case LLONG_MIN/(-1).
> My point about Rx_a is that idiv will cause an out-of-range exception
> for many values other than Rx_a == INT64_MIN.
> I'm not sure that divisor -1 is the only such case either.
> Probably is, since intuitively -2 and all other divisors should fit fine.
> So the check likely needs Rx_b == -1 and a check for high bit in Rx_a ?
With Rx_b == -1, it looks like only Rx_a == INT64_MIN causes the problem.
All other Rx_a values (from INT64_MIN+1 to INT64_MAX)
should be okay. Some selective testing on an x86_64 host:
$ cat t5.c
#include <stdio.h>
#include <limits.h>
unsigned long long res;
int main(void) {
    volatile long long a;
    long long i;
    for (i = LLONG_MIN + 1; i <= LLONG_MIN + 100; i++) {
        volatile long long b = -1;
        a = i;
        res += (unsigned long long)(a/b);
    }
    for (i = LLONG_MAX - 100; i <= LLONG_MAX - 1; i++) {
        volatile long long b = -1;
        a = i;
        res += (unsigned long long)(a/b);
    }
    printf("res = %llx\n", res);
    return 0;
}
$ gcc -O2 t5.c && ./a.out
res = 64
So I think dividing by -1 should be okay when the dividend is in the
range from LLONG_MIN + 1 to LLONG_MAX - 1.
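The spot check above samples only 200 dividends. The underlying reason can
also be stated directly: dividing by -1 is a negation, and the negation fails
to fit in a signed 64-bit value only when the dividend is LLONG_MIN. A minimal
sketch of that check, assuming a GCC- or Clang-compatible compiler for
__builtin_sub_overflow; neg_overflows is a hypothetical name used only for
illustration:

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when a / -1 (that is, -a) cannot be
 * represented in a signed long long. Computes 0 - a with the compiler's
 * overflow-checking builtin instead of performing the division, so it is
 * safe to call with any input, including LLONG_MIN.
 */
static bool neg_overflows(long long a)
{
    long long r;

    return __builtin_sub_overflow(0LL, a, &r);
}
```

Calling it with LLONG_MIN reports an overflow, while LLONG_MIN + 1, 0 and
LLONG_MAX do not.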
Now for LLONG_MAX/(-1):
$ cat t6.c
#include <stdio.h>
#include <limits.h>
int main(void) {
    volatile long long a = LLONG_MAX;
    volatile long long b = -1;
    printf("a/b = %lld\n", a/b);
    return 0;
}
$ gcc -O2 t6.c && ./a.out
a/b = -9223372036854775807
It is okay too. So I think LLONG_MIN/(-1) is the only case
we should take care of.
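To make the special case concrete, here is a rough userspace sketch of a
guarded signed divide and modulo. It is not the verifier patch. It assumes we
keep the existing "div 0 -> 0" and "mod 0 -> x" rules and follow the arm64
result for LLONG_MIN/(-1). The modulo result of 0 for a divisor of -1 is my
assumption (the mathematical remainder), and safe_sdiv64/safe_smod64 are
hypothetical names:

```c
#include <limits.h>

/* Hypothetical helper: signed 64-bit divide that never traps. */
static long long safe_sdiv64(long long a, long long b)
{
    if (b == 0)
        return 0;               /* x div 0 -> 0 */
    if (b == -1)
        /*
         * Negate through unsigned arithmetic so LLONG_MIN wraps back to
         * LLONG_MIN (the arm64 result) instead of reaching idiv. The
         * conversion back to signed is modular on GCC and Clang.
         */
        return (long long)(0ULL - (unsigned long long)a);
    return a / b;
}

/* Hypothetical helper: signed 64-bit modulo that never traps. */
static long long safe_smod64(long long a, long long b)
{
    if (b == 0)
        return a;               /* x mod 0 -> x */
    if (b == -1)
        return 0;               /* assumed: remainder of x / -1 is 0 */
    return a % b;
}
```

Only the divisor needs to be tested at run time: with b == -1 the quotient is
just the negated dividend, so the LLONG_MIN case is covered without a second
comparison on the dividend.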