* divide error in x86 and cputime
@ 2025-07-07 8:14 Li,Rongqing
2025-07-07 15:11 ` Steven Rostedt
2025-07-07 22:09 ` Oleg Nesterov
0 siblings, 2 replies; 23+ messages in thread
From: Li,Rongqing @ 2025-07-07 8:14 UTC (permalink / raw)
To: oleg@redhat.com
Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, rostedt@goodmis.org,
dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
juri.lelli@redhat.com, peterz@infradead.org, mingo@redhat.com
Hi:
I see a divide error on an x86 machine; the stack trace is below:
[78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
[78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1
[78250815.703853] Hardware name: Inspur SSINSPURMBX-XA3-100D-B356/NF5280A6, BIOS 3.00.21 06/27/2022
[78250815.703859] RIP: 0010:cputime_adjust+0x55/0xb0
[78250815.703860] Code: 3b 4c 8b 4d 10 48 89 c6 49 8d 04 38 4c 39 c8 73 38 48 8b 45 00 48 8b 55 08 48 85 c0 74 16 48 85 d2 74 49 48 8d 0c 10 49 f7 e1 <48> f7 f1 49 39 c0 4c 0f 42 c0 4c 89 c8 4c 29 c0 48 39 c7 77 25 48
[78250815.703861] RSP: 0018:ffffa34c2517bc40 EFLAGS: 00010887
[78250815.703864] RAX: 69f98da9ba980c00 RBX: ffff976c93d2a5e0 RCX: 0000000709e00900
[78250815.703864] RDX: 00f5dfffab0fc352 RSI: 0000000000000082 RDI: ff07410dca0bcd5e
[78250815.703865] RBP: ffffa34c2517bc70 R08: 00f5dfff54f8e5ce R09: fffd213aabd74626
[78250815.703866] R10: ffffa34c2517bed8 R11: 0000000000000000 R12: ffff976c93d2a5f0
[78250815.703867] R13: ffffa34c2517bd78 R14: ffffa34c2517bd70 R15: 0000000000001000
[78250815.703868] FS: 00007f58060f97a0(0000) GS:ffff976afe9c0000(0000) knlGS:0000000000000000
[78250815.703869] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[78250815.703870] CR2: 00007f580610e000 CR3: 0000017e3b3d2004 CR4: 0000000000770ee0
[78250815.703870] PKRU: 55555554
[78250815.703871] Call Trace:
[78250815.703877] thread_group_cputime_adjusted+0x4a/0x70
[78250815.703881] do_task_stat+0x2ed/0xe00
[78250815.703885] ? khugepaged_enter_vma_merge+0x12/0xd0
[78250815.703888] proc_single_show+0x51/0xc0
[78250815.703892] seq_read_iter+0x185/0x3c0
[78250815.703895] seq_read+0x106/0x150
[78250815.703898] vfs_read+0x98/0x180
[78250815.703900] ksys_read+0x59/0xd0
[78250815.703904] do_syscall_64+0x33/0x40
[78250815.703907] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[78250815.703910] RIP: 0033:0x318aeda360
It is caused by a process with many threads running for a very long time: utime+stime overflowed 64 bits, which resulted in the division below
mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900);
I see the comment on mul_u64_u64_div_u64() says:
Will generate an #DE when the result doesn't fit u64, could fix with an
__ex_table[] entry when it becomes an issue
It seems the __ex_table[] entry for div does not work?
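For reference, the #DE condition can be checked with arbitrary-precision arithmetic: on x86-64, divq faults when the 128-bit dividend in RDX:RAX divided by the 64-bit divisor would produce a quotient wider than 64 bits. A small sketch with the reported operands:

```python
# Reported operands from the oops (RAX, R9, and the low bits of RCX)
a   = 0x69f98da9ba980c00  # stime
mul = 0xfffd213aabd74626  # rtime
div = 0x09e00900          # stime + utime, wrapped around 64 bits

product = a * mul          # the full 128-bit result of mulq
hi64 = product >> 64       # high half, left in RDX by mulq
# divq raises #DE when the quotient would not fit in 64 bits,
# i.e. when the high 64 bits of the dividend are >= the divisor
overflows = hi64 >= div
print(overflows)
```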
Thanks
-Li
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 8:14 divide error in x86 and cputime Li,Rongqing
@ 2025-07-07 15:11 ` Steven Rostedt
2025-07-07 22:09 ` Oleg Nesterov
1 sibling, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2025-07-07 15:11 UTC (permalink / raw)
To: Li,Rongqing
Cc: oleg@redhat.com, linux-kernel@vger.kernel.org,
vschneid@redhat.com, mgorman@suse.de, bsegall@google.com,
dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
juri.lelli@redhat.com, peterz@infradead.org, mingo@redhat.com
On Mon, 7 Jul 2025 08:14:41 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:
> Hi:
>
> I see a divide error on x86 machine, the stack is below:
>
>
> [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
> [78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1
Did you see this on a 5.10 kernel?
Do you see it on something more recent? Preferably the 6.15 or 6.16.
-- Steve
> [78250815.703853] Hardware name: Inspur SSINSPURMBX-XA3-100D-B356/NF5280A6, BIOS 3.00.21 06/27/2022
> [78250815.703859] RIP: 0010:cputime_adjust+0x55/0xb0
> [78250815.703860] Code: 3b 4c 8b 4d 10 48 89 c6 49 8d 04 38 4c 39 c8 73 38 48 8b 45 00 48 8b 55 08 48 85 c0 74 16 48 85 d2 74 49 48 8d 0c 10 49 f7 e1 <48> f7 f1 49 39 c0 4c 0f 42 c0 4c 89 c8 4c 29 c0 48 39 c7 77 25 48
> [78250815.703861] RSP: 0018:ffffa34c2517bc40 EFLAGS: 00010887
> [78250815.703864] RAX: 69f98da9ba980c00 RBX: ffff976c93d2a5e0 RCX: 0000000709e00900
> [78250815.703864] RDX: 00f5dfffab0fc352 RSI: 0000000000000082 RDI: ff07410dca0bcd5e
> [78250815.703865] RBP: ffffa34c2517bc70 R08: 00f5dfff54f8e5ce R09: fffd213aabd74626
> [78250815.703866] R10: ffffa34c2517bed8 R11: 0000000000000000 R12: ffff976c93d2a5f0
> [78250815.703867] R13: ffffa34c2517bd78 R14: ffffa34c2517bd70 R15: 0000000000001000
> [78250815.703868] FS: 00007f58060f97a0(0000) GS:ffff976afe9c0000(0000) knlGS:0000000000000000
> [78250815.703869] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [78250815.703870] CR2: 00007f580610e000 CR3: 0000017e3b3d2004 CR4: 0000000000770ee0
> [78250815.703870] PKRU: 55555554
> [78250815.703871] Call Trace:
> [78250815.703877] thread_group_cputime_adjusted+0x4a/0x70
> [78250815.703881] do_task_stat+0x2ed/0xe00
> [78250815.703885] ? khugepaged_enter_vma_merge+0x12/0xd0
> [78250815.703888] proc_single_show+0x51/0xc0
> [78250815.703892] seq_read_iter+0x185/0x3c0
> [78250815.703895] seq_read+0x106/0x150
> [78250815.703898] vfs_read+0x98/0x180
> [78250815.703900] ksys_read+0x59/0xd0
> [78250815.703904] do_syscall_64+0x33/0x40
> [78250815.703907] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [78250815.703910] RIP: 0033:0x318aeda360
>
>
> It caused by a process with many threads running very long, and utime+stime overflowed 64bit, then cause the below div
>
> mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900);
>
> I see the comments of mul_u64_u64_div_u64() say:
>
> Will generate an #DE when the result doesn't fit u64, could fix with an
> __ex_table[] entry when it becomes an issu
>
>
> Seem __ex_table[] entry for div does not work ?
>
> Thanks
>
> -Li
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 8:14 divide error in x86 and cputime Li,Rongqing
2025-07-07 15:11 ` Steven Rostedt
@ 2025-07-07 22:09 ` Oleg Nesterov
2025-07-07 22:20 ` Steven Rostedt
2025-07-07 22:30 ` Oleg Nesterov
1 sibling, 2 replies; 23+ messages in thread
From: Oleg Nesterov @ 2025-07-07 22:09 UTC (permalink / raw)
To: Li,Rongqing, Peter Zijlstra, David Laight
Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, rostedt@goodmis.org,
dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
juri.lelli@redhat.com, mingo@redhat.com
On 07/07, Li,Rongqing wrote:
>
> [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
...
> It caused by a process with many threads running very long,
> and utime+stime overflowed 64bit, then cause the below div
>
> mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900);
>
> I see the comments of mul_u64_u64_div_u64() say:
>
> Will generate an #DE when the result doesn't fit u64, could fix with an
> __ex_table[] entry when it becomes an issu
>
> Seem __ex_table[] entry for div does not work ?
Well, the current version doesn't have an __ex_table[] entry for div...
I do not know what we can/should do in this case... Perhaps
static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
{
int ok = 0;
u64 q;
asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
_ASM_EXTABLE(1b, 2b)
: "=a" (q), "+r" (ok)
: "a" (a), "rm" (mul), "rm" (div)
: "rdx");
return ok ? q : -1ul;
}
?
Should return ULLONG_MAX on #DE.
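In Python terms, the proposed semantics would look roughly like the model below (an illustrative sketch of the intended behavior, not the actual kernel implementation):

```python
U64_MAX = 2**64 - 1

def mul_u64_u64_div_u64_model(a, mul, div):
    # Model of the proposed semantics: compute a * mul / div exactly and
    # saturate to ULLONG_MAX when the quotient does not fit in 64 bits
    # (or when div is 0, which would also raise #DE on x86).
    if div == 0:
        return U64_MAX
    q = (a * mul) // div
    return min(q, U64_MAX)

# The operands from the oops would then saturate instead of faulting
print(hex(mul_u64_u64_div_u64_model(0x69f98da9ba980c00,
                                    0xfffd213aabd74626,
                                    0x09e00900)))
```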
Oleg.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 22:09 ` Oleg Nesterov
@ 2025-07-07 22:20 ` Steven Rostedt
2025-07-07 22:33 ` Steven Rostedt
2025-07-07 22:30 ` Oleg Nesterov
1 sibling, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2025-07-07 22:20 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Li,Rongqing, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Tue, 8 Jul 2025 00:09:38 +0200
Oleg Nesterov <oleg@redhat.com> wrote:
> Well, the current version doesn't have an __ex_table[] entry for div...
>
> I do not know what can/should we do in this case... Perhaps
>
> static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> {
> int ok = 0;
> u64 q;
>
> asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
> _ASM_EXTABLE(1b, 2b)
> : "=a" (q), "+r" (ok)
> : "a" (a), "rm" (mul), "rm" (div)
> : "rdx");
>
> return ok ? q : -1ul;
> }
>
> ?
>
> Should return ULLONG_MAX on #DE.
I would say this should never happen and if it does, let the kernel crash.
-- Steve
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 22:09 ` Oleg Nesterov
2025-07-07 22:20 ` Steven Rostedt
@ 2025-07-07 22:30 ` Oleg Nesterov
2025-07-07 23:41 ` Re: " Li,Rongqing
1 sibling, 1 reply; 23+ messages in thread
From: Oleg Nesterov @ 2025-07-07 22:30 UTC (permalink / raw)
To: Li,Rongqing, Peter Zijlstra, David Laight
Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, rostedt@goodmis.org,
dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
juri.lelli@redhat.com, mingo@redhat.com
On second thought, this

mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900);
                    ^ stime             ^ rtime             ^ stime + utime
looks suspicious:
- stime > stime + utime
- rtime = 0xfffd213aabd74626 is absurdly huge
so perhaps there is another problem?
Oleg.
On 07/08, Oleg Nesterov wrote:
>
> On 07/07, Li,Rongqing wrote:
> >
> > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
>
> ...
>
> > It caused by a process with many threads running very long,
> > and utime+stime overflowed 64bit, then cause the below div
> >
> > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900);
> >
> > I see the comments of mul_u64_u64_div_u64() say:
> >
> > Will generate an #DE when the result doesn't fit u64, could fix with an
> > __ex_table[] entry when it becomes an issu
> >
> > Seem __ex_table[] entry for div does not work ?
>
> Well, the current version doesn't have an __ex_table[] entry for div...
>
> I do not know what can/should we do in this case... Perhaps
>
> static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> {
> int ok = 0;
> u64 q;
>
> asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
> _ASM_EXTABLE(1b, 2b)
> : "=a" (q), "+r" (ok)
> : "a" (a), "rm" (mul), "rm" (div)
> : "rdx");
>
> return ok ? q : -1ul;
> }
>
> ?
>
> Should return ULLONG_MAX on #DE.
>
> Oleg.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 22:20 ` Steven Rostedt
@ 2025-07-07 22:33 ` Steven Rostedt
2025-07-07 23:00 ` Oleg Nesterov
2025-07-08 1:40 ` Re: " Li,Rongqing
0 siblings, 2 replies; 23+ messages in thread
From: Steven Rostedt @ 2025-07-07 22:33 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Li,Rongqing, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Mon, 7 Jul 2025 18:20:56 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> I would say this should never happen and if it does, let the kernel crash.
>> [78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1
This happened on a 5.10 kernel with a proprietary module loaded, so
honestly, if it can't be reproduced on a newer kernel without any
proprietary modules loaded, I say we don't worry about it.
I also don't buy the utime + stime overflowing a 64-bit number.
2^64 / 2 = 2^63 = 9223372036854775808
That would be:
minutes days
v v
9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27
^ ^ ^
ns -> sec hours years
So even though the report says they have threads running for a very long time, that would
still be 292 years of run time!
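A quick check of that arithmetic:

```python
ns = 2**63                 # half the u64 range, in nanoseconds
years = ns / 1e9 / 60 / 60 / 24 / 365.25
print(round(years, 2))     # about 292.27 years
```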
-- Steve
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 22:33 ` Steven Rostedt
@ 2025-07-07 23:00 ` Oleg Nesterov
2025-07-08 11:00 ` David Laight
2025-07-08 1:40 ` Re: " Li,Rongqing
1 sibling, 1 reply; 23+ messages in thread
From: Oleg Nesterov @ 2025-07-07 23:00 UTC (permalink / raw)
To: Steven Rostedt
Cc: Li,Rongqing, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On 07/07, Steven Rostedt wrote:
>
> On Mon, 7 Jul 2025 18:20:56 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > I would say this should never happen and if it does, let the kernel crash.
>
> >> [78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1
>
> This happened on a 5.10 kernel with a proprietary module loaded, so
> honestly, if it can't be reproduced on a newer kernel without any
> proprietary modules loaded, I say we don't worry about it.
Yes, agreed, see my reply to myself.
Oleg.
> I also don't by the utime + stime overflowing a 64bit number.
>
> 2^64 / 2 = 2^63 = 9223372036854775808
>
> That would be:
>
> minutes days
> v v
> 9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27
> ^ ^ ^
> ns -> sec hours years
>
> So the report says they have threads running for a very long time, it would
> still be 292 years of run time!
>
> -- Steve
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 22:30 ` Oleg Nesterov
@ 2025-07-07 23:41 ` Li,Rongqing
2025-07-07 23:53 ` Steven Rostedt
2025-07-08 0:23 ` Re: " Li,Rongqing
0 siblings, 2 replies; 23+ messages in thread
From: Li,Rongqing @ 2025-07-07 23:41 UTC (permalink / raw)
To: Oleg Nesterov, Peter Zijlstra, David Laight
Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, rostedt@goodmis.org,
dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
juri.lelli@redhat.com, mingo@redhat.com
> On a second thought, this
>
> mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> 0x09e00900);
> stime rtime
> stime + utime
>
> looks suspicious:
>
> - stime > stime + utime
>
> - rtime = 0xfffd213aabd74626 is absurdly huge
>
> so perhaps there is another problem?
>
It happened with a process that had 236 busy-polling threads and ran for about 904 days; the total time overflows 64 bits.
Non-x86 systems may have the same issue: once (stime + utime) overflows 64 bits, mul_u64_u64_div_u64() from lib/math/div64.c may cause a division by 0.
So, for cputime, could cputime_adjust() just return stime when stime + utime overflows?
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 6dab4854..db0c273 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
goto update;
}
+ if (stime > (stime + utime)) {
+ goto update;
+ }
+
stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
/*
* Because mul_u64_u64_div_u64() can approximate on some
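As a sanity check of the wraparound test in the hunk above (sketched in Python; the utime value here is reconstructed from the reported stime and sum, so treat it as an assumption):

```python
M = 2**64

def u64_sum_wraps(a, b):
    # C u64 addition wraps mod 2**64; the patch's "stime > (stime + utime)"
    # test is exactly this wraparound check
    return (a + b) % M < a

stime = 0x69f98da9ba980c00
# utime inferred so that the sum wraps to the reported 0x09e00900;
# it does not appear in the oops itself
utime = 0x960672564f47fd00
print(u64_sum_wraps(stime, utime), hex((stime + utime) % M))
```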
Thanks
-Li
> Oleg.
>
> On 07/08, Oleg Nesterov wrote:
> >
> > On 07/07, Li,Rongqing wrote:
> > >
> > > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
> >
> > ...
> >
> > > It caused by a process with many threads running very long, and
> > > utime+stime overflowed 64bit, then cause the below div
> > >
> > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> > > 0x09e00900);
> > >
> > > I see the comments of mul_u64_u64_div_u64() say:
> > >
> > > Will generate an #DE when the result doesn't fit u64, could fix with
> > > an __ex_table[] entry when it becomes an issu
> > >
> > > Seem __ex_table[] entry for div does not work ?
> >
> > Well, the current version doesn't have an __ex_table[] entry for div...
> >
> > I do not know what can/should we do in this case... Perhaps
> >
> > static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> > {
> > int ok = 0;
> > u64 q;
> >
> > asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
> > _ASM_EXTABLE(1b, 2b)
> > : "=a" (q), "+r" (ok)
> > : "a" (a), "rm" (mul), "rm" (div)
> > : "rdx");
> >
> > return ok ? q : -1ul;
> > }
> >
> > ?
> >
> > Should return ULLONG_MAX on #DE.
> >
> > Oleg.
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 23:41 ` Re: " Li,Rongqing
@ 2025-07-07 23:53 ` Steven Rostedt
2025-07-08 0:10 ` Re: " Li,Rongqing
2025-07-08 0:23 ` Re: " Li,Rongqing
1 sibling, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2025-07-07 23:53 UTC (permalink / raw)
To: Li,Rongqing
Cc: Oleg Nesterov, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Mon, 7 Jul 2025 23:41:14 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:
> > On a second thought, this
> >
> > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> > 0x09e00900);
> > stime rtime
> > stime + utime
> >
> > looks suspicious:
> >
> > - stime > stime + utime
> >
> > - rtime = 0xfffd213aabd74626 is absurdly huge
> >
> > so perhaps there is another problem?
> >
>
> it happened when a process with 236 busy polling threads , run about 904 days, the total time will overflow the 64bit
>
> non-x86 system maybe has same issue, once (stime + utime) overflows 64bit, mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division by 0
>
> so to cputime, could cputime_adjust() return stime if stime if stime + utime is overflow
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 6dab4854..db0c273 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
> goto update;
> }
>
> + if (stime > (stime + utime)) {
> + goto update;
> + }
> +
> stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> /*
> * Because mul_u64_u64_div_u64() can approximate on some
>
Are you running 5.10.0? Because a diff of 5.10.238 from 5.10.0 gives:
@@ -579,6 +579,12 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
}
stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
+ /*
+ * Because mul_u64_u64_div_u64() can approximate on some
+ * achitectures; enforce the constraint that: a*b/(b+c) <= a.
+ */
+ if (unlikely(stime > rtime))
+ stime = rtime;
update:
Thus the result is what's getting screwed up.
-- Steve
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 23:53 ` Steven Rostedt
@ 2025-07-08 0:10 ` Li,Rongqing
2025-07-08 0:30 ` Steven Rostedt
0 siblings, 1 reply; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 0:10 UTC (permalink / raw)
To: Steven Rostedt
Cc: Oleg Nesterov, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
> On Mon, 7 Jul 2025 23:41:14 +0000
> "Li,Rongqing" <lirongqing@baidu.com> wrote:
>
> > > On a second thought, this
> > >
> > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> > > 0x09e00900);
> > > stime rtime
> > > stime + utime
> > >
> > > looks suspicious:
> > >
> > > - stime > stime + utime
> > >
> > > - rtime = 0xfffd213aabd74626 is absurdly huge
> > >
> > > so perhaps there is another problem?
> > >
> >
> > it happened when a process with 236 busy polling threads , run about
> > 904 days, the total time will overflow the 64bit
> >
> > non-x86 system maybe has same issue, once (stime + utime) overflows
> > 64bit, mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division
> > by 0
> >
> > so to cputime, could cputime_adjust() return stime if stime if stime +
> > utime is overflow
> >
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index
> > 6dab4854..db0c273 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr,
> struct prev_cputime *prev,
> > goto update;
> > }
> >
> > + if (stime > (stime + utime)) {
> > + goto update;
> > + }
> > +
> > stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > /*
> > * Because mul_u64_u64_div_u64() can approximate on some
> >
>
> Are you running 5.10.0? Because a diff of 5.10.238 from 5.10.0 gives:
>
> @@ -579,6 +579,12 @@ void cputime_adjust(struct task_cputime *curr, struct
> prev_cputime *prev,
> }
>
> stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> + /*
> + * Because mul_u64_u64_div_u64() can approximate on some
> + * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> + */
> + if (unlikely(stime > rtime))
> + stime = rtime;
My 5.10 does not have the patch "sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime",
but I am sure that patch cannot fix this overflow issue, since the divide error happened inside mul_u64_u64_div_u64()
Thanks
-Li
>
> update:
>
>
> Thus the result is what's getting screwed up.
>
> -- Steve
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 23:41 ` Re: " Li,Rongqing
2025-07-07 23:53 ` Steven Rostedt
@ 2025-07-08 0:23 ` Li,Rongqing
1 sibling, 0 replies; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 0:23 UTC (permalink / raw)
To: Oleg Nesterov, Peter Zijlstra, David Laight
Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, rostedt@goodmis.org,
dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
juri.lelli@redhat.com, mingo@redhat.com
> non-x86 system maybe has same issue, once (stime + utime) overflows 64bit,
> mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division by 0
>
Correction: mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x009e00900) from lib/math/div64.c returns 0xffffffffffffffff
> so to cputime, could cputime_adjust() return stime if stime if stime + utime is
> overflow
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index
> 6dab4854..db0c273 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct
> prev_cputime *prev,
> goto update;
> }
>
> + if (stime > (stime + utime)) {
> + goto update;
> + }
> +
> stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> /*
> * Because mul_u64_u64_div_u64() can approximate on some
>
>
> Thanks
>
> -Li
>
>
> > Oleg.
> >
> > On 07/08, Oleg Nesterov wrote:
> > >
> > > On 07/07, Li,Rongqing wrote:
> > > >
> > > > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
> > >
> > > ...
> > >
> > > > It caused by a process with many threads running very long, and
> > > > utime+stime overflowed 64bit, then cause the below div
> > > >
> > > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
> > > > 0x09e00900);
> > > >
> > > > I see the comments of mul_u64_u64_div_u64() say:
> > > >
> > > > Will generate an #DE when the result doesn't fit u64, could fix
> > > > with an __ex_table[] entry when it becomes an issu
> > > >
> > > > Seem __ex_table[] entry for div does not work ?
> > >
> > > Well, the current version doesn't have an __ex_table[] entry for div...
> > >
> > > I do not know what can/should we do in this case... Perhaps
> > >
> > > static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> > > {
> > > int ok = 0;
> > > u64 q;
> > >
> > > asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
> > > _ASM_EXTABLE(1b, 2b)
> > > : "=a" (q), "+r" (ok)
> > > : "a" (a), "rm" (mul), "rm" (div)
> > > : "rdx");
> > >
> > > return ok ? q : -1ul;
> > > }
> > >
> > > ?
> > >
> > > Should return ULLONG_MAX on #DE.
> > >
> > > Oleg.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-08 0:10 ` Re: " Li,Rongqing
@ 2025-07-08 0:30 ` Steven Rostedt
2025-07-08 1:17 ` Re: " Li,Rongqing
2025-07-08 10:35 ` Re: " David Laight
0 siblings, 2 replies; 23+ messages in thread
From: Steven Rostedt @ 2025-07-08 0:30 UTC (permalink / raw)
To: Li,Rongqing
Cc: Oleg Nesterov, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Tue, 8 Jul 2025 00:10:54 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:
> > stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > + /*
> > + * Because mul_u64_u64_div_u64() can approximate on some
> > + * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> > + */
> > + if (unlikely(stime > rtime))
> > + stime = rtime;
>
>
> My 5.10 has not this patch " sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime ",
> but I am sure this patch can not fix this overflow issue, Since division error happened in mul_u64_u64_div_u64()
Have you tried it? Or are you just making an assumption?
How can you be so sure? Did you even *look* at the commit?
sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
In extreme test scenarios:
the 14th field utime in /proc/xx/stat is greater than sum_exec_runtime,
utime = 18446744073709518790 ns, rtime = 135989749728000 ns
In cputime_adjust() process, stime is greater than rtime due to
mul_u64_u64_div_u64() precision problem.
before call mul_u64_u64_div_u64(),
stime = 175136586720000, rtime = 135989749728000, utime = 1416780000.
after call mul_u64_u64_div_u64(),
stime = 135989949653530
unsigned reversion occurs because rtime is less than stime.
utime = rtime - stime = 135989749728000 - 135989949653530
= -199925530
= (u64)18446744073709518790
Trigger condition:
1). User task run in kernel mode most of time
2). ARM64 architecture
3). TICK_CPU_ACCOUNTING=y
CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
Fix mul_u64_u64_div_u64() conversion precision by reset stime to rtime
When stime ends up greater than rtime, it causes utime to go NEGATIVE!
That means *YES* it can overflow a u64 number. That's your bug.
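The "unsigned reversion" in the quoted commit message is ordinary u64 wraparound, which is easy to reproduce with the stime/rtime values quoted above:

```python
M = 2**64
rtime = 135989749728000        # values from the quoted commit message
stime = 135989949653530        # slightly larger than rtime after the lossy divide
utime = (rtime - stime) % M    # C u64 subtraction wraps instead of going negative
print(utime, utime > 2**63)    # a huge bogus value rather than a small negative one
```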
Next time, look to see if there's fixes in the code that is triggering
issues for you and test them out, before bothering upstream.
Goodbye.
-- Steve
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-08 0:30 ` Steven Rostedt
@ 2025-07-08 1:17 ` Li,Rongqing
2025-07-08 1:41 ` Steven Rostedt
2025-07-08 10:35 ` Re: " David Laight
1 sibling, 1 reply; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 1:17 UTC (permalink / raw)
To: Steven Rostedt
Cc: Oleg Nesterov, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
> Have you tried it? Or are you just making an assumption?
>
> How can you be so sure? Did you even *look* at the commit?
>
> sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
>
> In extreme test scenarios:
> the 14th field utime in /proc/xx/stat is greater than sum_exec_runtime,
> utime = 18446744073709518790 ns, rtime = 135989749728000 ns
>
> In cputime_adjust() process, stime is greater than rtime due to
> mul_u64_u64_div_u64() precision problem.
> before call mul_u64_u64_div_u64(),
> stime = 175136586720000, rtime = 135989749728000, utime =
> 1416780000.
> after call mul_u64_u64_div_u64(),
> stime = 135989949653530
>
> unsigned reversion occurs because rtime is less than stime.
> utime = rtime - stime = 135989749728000 - 135989949653530
> = -199925530
> = (u64)18446744073709518790
>
I will try to test this patch, but I think it is a different case:
stime is not greater than rtime in my case (stime = 0x69f98da9ba980c00, rtime = 0xfffd213aabd74626, stime + utime = 0x9e00900, so utime should be 0x960672564f47fd00), and the overflowing process had 236 busy-poll threads running for about 904 days, so I think these times are correct
Thanks
-Li
> Trigger condition:
> 1). User task run in kernel mode most of time
> 2). ARM64 architecture
> 3). TICK_CPU_ACCOUNTING=y
> CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
>
> Fix mul_u64_u64_div_u64() conversion precision by reset stime to rtime
>
>
> When stime ends up greater than rtime, it causes utime to go NEGATIVE!
>
> That means *YES* it can overflow a u64 number. That's your bug.
>
> Next time, look to see if there's fixes in the code that is triggering issues for you
> and test them out, before bothering upstream.
>
> Goodbye.
>
> -- Steve
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-07 22:33 ` Steven Rostedt
2025-07-07 23:00 ` Oleg Nesterov
@ 2025-07-08 1:40 ` Li,Rongqing
2025-07-08 1:53 ` Steven Rostedt
1 sibling, 1 reply; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 1:40 UTC (permalink / raw)
To: Steven Rostedt, Oleg Nesterov
Cc: Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org,
vschneid@redhat.com, mgorman@suse.de, bsegall@google.com,
dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
juri.lelli@redhat.com, mingo@redhat.com
> That would be:
>
> minutes days
> v v
> 9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27
> ^ ^ ^
> ns -> sec hours years
>
> So the report says they have threads running for a very long time, it would still
> be 292 years of run time!
Utime/rtime is u64, which means an overflow needs 292.27 * 2 ≈ 584 years.
But with multiple threads, e.g. 292 threads, it only needs two years; it is the total running time of the thread group:
void thread_group_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
{
struct task_cputime cputime;
thread_group_cputime(p, &cputime);
cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
}
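For what it's worth, the reported rtime is consistent with that scenario:

```python
rtime = 0xfffd213aabd74626   # group rtime from the oops, in nanoseconds
days = rtime / 1e9 / 86400
print(int(days / 365.25))    # total accumulated thread-years
print(int(days / 236))       # wall-clock days if spread over 236 busy threads
```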
-Li
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-08 1:17 ` Re: " Li,Rongqing
@ 2025-07-08 1:41 ` Steven Rostedt
0 siblings, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2025-07-08 1:41 UTC (permalink / raw)
To: Li,Rongqing
Cc: Oleg Nesterov, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Tue, 8 Jul 2025 01:17:50 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:
> Stime is not greater than rtime in my case, (stime= 0x69f98da9ba980c00,
> rtime= 0xfffd213aabd74626, stime+utime= 0x9e00900. So utime should be
> 0x960672564f47fd00 ), and this overflow process with 236 busy poll
> threads running about 904 day, so I think these times are correct
>
But look at rtime, it is *negative*. So maybe that fix isn't going to fix
this bug, but rtime is most definitely screwed up. That value is:
0xfffd213aabd74626 = (u64)18445936184654251558 = (s64)-807889055300058
There's no way run time should be 584 years in nanoseconds.
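That reinterpretation can be reproduced directly:

```python
rtime = 0xfffd213aabd74626
# reinterpret the u64 bit pattern as a signed 64-bit value
as_s64 = rtime - 2**64 if rtime >= 2**63 else rtime
print(rtime, as_s64)
```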
So if it's not fixed by that commit, it's a bug that happened before you even
got to the mul_u64_u64_div_u64() function. Touching that is only putting a
band-aid on the symptom, you haven't touched the real bug.
I bet there's likely another fix between what you are using and 5.10.238.
There's 31,101 commits between those two. You are using a way old kernel
without any fixes to it. It is known to be buggy. You will hit bugs with
it. No need to tell us about it.
-- Steve
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
2025-07-08 1:40 ` Re: " Li,Rongqing
@ 2025-07-08 1:53 ` Steven Rostedt
2025-07-08 1:58 ` Re: " Li,Rongqing
0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2025-07-08 1:53 UTC (permalink / raw)
To: Li,Rongqing
Cc: Oleg Nesterov, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Tue, 8 Jul 2025 01:40:27 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:
> > That would be:
> >
> >                                 minutes      days
> >                                    v          v
> > 9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27
> >                           ^             ^           ^
> >                       ns -> sec       hours       years
> >
> > So while the report says they have threads running for a very long time, it
> > would still be 292 years of run time!
>
> Utime/rtime is u64, which means an overflow needs 292.27*2 = 584 years,
>
> But with multiple threads, say 292, it only needs two years, since it is a
> thread group's total running time
>
>
> void thread_group_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
> {
> struct task_cputime cputime;
>
> thread_group_cputime(p, &cputime);
> cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
> }
>
So you are saying that you have been running this for over two years
without a reboot?
Then the issue isn't the divider, it's that the thread group cputime can
overflow. Perhaps it needs a cap, or a way to "reset" somehow after "so long"?
-- Steve
* Re: divide error in x86 and cputime
2025-07-08 1:53 ` Steven Rostedt
@ 2025-07-08 1:58 ` Li,Rongqing
2025-07-08 2:05 ` Steven Rostedt
0 siblings, 1 reply; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 1:58 UTC (permalink / raw)
To: Steven Rostedt
Cc: Oleg Nesterov, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
> "Li,Rongqing" <lirongqing@baidu.com> wrote:
>
> > > That would be:
> > >
> > >                                 minutes      days
> > >                                    v          v
> > > 9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27
> > >                           ^             ^           ^
> > >                       ns -> sec       hours       years
> > >
> > > So while the report says they have threads running for a very long
> > > time, it would still be 292 years of run time!
> >
> > Utime/rtime is u64, which means an overflow needs 292.27*2 = 584 years,
> >
> > But with multiple threads, say 292, it only needs two years, since it
> > is a thread group's total running time
> >
> >
> > void thread_group_cputime_adjusted(struct task_struct *p, u64 *ut, u64
> > *st) {
> > struct task_cputime cputime;
> >
> > thread_group_cputime(p, &cputime);
> > cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st); }
> >
>
> So you are saying that you have been running this for over two years without a
> reboot?
>
Yes. Considering that machines have more and more CPUs, I think it will be a common case.
> Then the issue isn't the divider, it's that the thread group cputime can overflow.
> Perhaps it needs a cap, or a way to "reset" somehow after "so long"?
I am not clear on how to reset it.
But mul_u64_u64_div_u64() on x86 should not trigger a divide-error panic; maybe it should return ULLONG_MAX on #DE (like the non-x86 mul_u64_u64_div_u64()).
>
> -- Steve
* Re: divide error in x86 and cputime
2025-07-08 1:58 ` Re: " Li,Rongqing
@ 2025-07-08 2:05 ` Steven Rostedt
2025-07-08 2:17 ` Oleg Nesterov
0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2025-07-08 2:05 UTC (permalink / raw)
To: Li,Rongqing
Cc: Oleg Nesterov, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Tue, 8 Jul 2025 01:58:00 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:
> But mul_u64_u64_div_u64() on x86 should not trigger a divide-error panic; maybe it should return ULLONG_MAX on #DE (like the non-x86 mul_u64_u64_div_u64()).
Perhaps. But it is still producing garbage.
-- Steve
* Re: divide error in x86 and cputime
2025-07-08 2:05 ` Steven Rostedt
@ 2025-07-08 2:17 ` Oleg Nesterov
2025-07-08 9:58 ` David Laight
0 siblings, 1 reply; 23+ messages in thread
From: Oleg Nesterov @ 2025-07-08 2:17 UTC (permalink / raw)
To: Steven Rostedt
Cc: Li,Rongqing, Peter Zijlstra, David Laight,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On 07/07, Steven Rostedt wrote:
>
> On Tue, 8 Jul 2025 01:58:00 +0000
> "Li,Rongqing" <lirongqing@baidu.com> wrote:
>
> > But mul_u64_u64_div_u64() on x86 should not trigger a divide-error panic;
> > maybe it should return ULLONG_MAX on #DE (like the non-x86 mul_u64_u64_div_u64()).
>
> Perhaps.
So do you think
static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
{
int ok = 0;
u64 q;
asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
_ASM_EXTABLE(1b, 2b)
: "=a" (q), "+r" (ok)
: "a" (a), "rm" (mul), "rm" (div)
: "rdx");
return ok ? q : -1ul;
}
makes sense at least for consistency with the generic implementation
in lib/math/div64.c ?
> But it is still producing garbage.
Agreed. And not a solution to this particular problem.
Oleg.
* Re: divide error in x86 and cputime
2025-07-08 2:17 ` Oleg Nesterov
@ 2025-07-08 9:58 ` David Laight
0 siblings, 0 replies; 23+ messages in thread
From: David Laight @ 2025-07-08 9:58 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Steven Rostedt, Li,Rongqing, Peter Zijlstra,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Tue, 8 Jul 2025 04:17:04 +0200
Oleg Nesterov <oleg@redhat.com> wrote:
> On 07/07, Steven Rostedt wrote:
> >
> > On Tue, 8 Jul 2025 01:58:00 +0000
> > "Li,Rongqing" <lirongqing@baidu.com> wrote:
> >
> > > But mul_u64_u64_div_u64() on x86 should not trigger a divide-error panic;
> > > maybe it should return ULLONG_MAX on #DE (like the non-x86 mul_u64_u64_div_u64()).
> >
> > Perhaps.
>
> So do you think
>
> static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> {
> int ok = 0;
> u64 q;
>
> asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
> _ASM_EXTABLE(1b, 2b)
> : "=a" (q), "+r" (ok)
> : "a" (a), "rm" (mul), "rm" (div)
> : "rdx");
>
> return ok ? q : -1ul;
You need to decide what to return/do when 'div' is zero.
So perhaps:
if (ok)
return q;
BUG_ON(!div);
return ~(u64)0;
But maybe 0/0 should return 0.
> }
>
> makes sense at least for consistency with the generic implementation
> in lib/math/div64.c ?
I don't like the way the current version handles divide by zero at all.
Even forcing the cpu to execute a 'divide by zero' doesn't seem right.
The result should be well defined (and useful).
It might even be worth adding an extra parameter to report overflow
and return ~0 for overflow and 0 for divide by zero (I think that is
less likely to cause grief in the following instructions).
That does 'pass the buck' to the caller.
>
> > But it is still producing garbage.
>
> Agreed. And not a solution to this particular problem.
Using mul_u64_u64_div_u64() here is also horribly expensive for a
simple split between (IIRC) utime and stime.
It isn't too bad on x86-64, but everywhere else it is horrid.
For 'random' values the code hits 900 clocks on x86-32 - and that
is in userspace with cmov and %ebp as a general register.
My new version is ~230 for x86-32 and ~130 for x86-64 (not doing
the fast asm) on ivy bridge, ~80 for x86-64 on zen5.
(I'm on holiday and have limited systems available.)
David
>
> Oleg.
>
* Re: divide error in x86 and cputime
2025-07-08 0:30 ` Steven Rostedt
2025-07-08 1:17 ` Re: " Li,Rongqing
@ 2025-07-08 10:35 ` David Laight
2025-07-08 11:12 ` Re: " Li,Rongqing
1 sibling, 1 reply; 23+ messages in thread
From: David Laight @ 2025-07-08 10:35 UTC (permalink / raw)
To: Steven Rostedt
Cc: Li,Rongqing, Oleg Nesterov, Peter Zijlstra,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Mon, 7 Jul 2025 20:30:57 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 8 Jul 2025 00:10:54 +0000
> "Li,Rongqing" <lirongqing@baidu.com> wrote:
>
> > > stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > > + /*
> > > + * Because mul_u64_u64_div_u64() can approximate on some
> > > + * architectures; enforce the constraint that: a*b/(b+c) <= a.
> > > + */
> > > + if (unlikely(stime > rtime))
> > > + stime = rtime;
> >
> >
> > My 5.10 has not this patch " sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime ",
> > but I am sure this patch can not fix this overflow issue, Since division error happened in mul_u64_u64_div_u64()
>
> Have you tried it? Or are you just making an assumption?
>
> How can you be so sure? Did you even *look* at the commit?
It can't be relevant.
That change is after the mul_u64_u64_div_u64() call that trapped.
It is also not relevant for x86-64 because it uses the asm version.
At some point mul_u64_u64_div_u64() got changed to be accurate (and slow)
so that check isn't needed any more.
David
* Re: divide error in x86 and cputime
2025-07-07 23:00 ` Oleg Nesterov
@ 2025-07-08 11:00 ` David Laight
0 siblings, 0 replies; 23+ messages in thread
From: David Laight @ 2025-07-08 11:00 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Steven Rostedt, Li,Rongqing, Peter Zijlstra,
linux-kernel@vger.kernel.org, vschneid@redhat.com,
mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com,
vincent.guittot@linaro.org, juri.lelli@redhat.com,
mingo@redhat.com
On Tue, 8 Jul 2025 01:00:57 +0200
Oleg Nesterov <oleg@redhat.com> wrote:
> On 07/07, Steven Rostedt wrote:
> >
> > On Mon, 7 Jul 2025 18:20:56 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > > I would say this should never happen and if it does, let the kernel crash.
> >
> > >> [78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1
> >
> > This happened on a 5.10 kernel with a proprietary module loaded, so
> > honestly, if it can't be reproduced on a newer kernel without any
> > proprietary modules loaded, I say we don't worry about it.
>
> Yes, agreed, see my reply to myself.
Except that just isn't relevant.
The problem is that the process running time (across all threads) can
easily exceed 2^64 nanoseconds.
With CPUs having more and more cores, and software spinning to reduce
latency, it will get more and more common.
Perhaps standardising on ns for timers (etc) wasn't such a bright idea.
Maybe 100ns would have been better.
But the process 'rtime' does need dividing down somewhat.
Thread 'rtime' is fine - 584 years isn't going to be our problem!
David
* Re: divide error in x86 and cputime
2025-07-08 10:35 ` Re: " David Laight
@ 2025-07-08 11:12 ` Li,Rongqing
0 siblings, 0 replies; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 11:12 UTC (permalink / raw)
To: David Laight, Steven Rostedt
Cc: Oleg Nesterov, Peter Zijlstra, linux-kernel@vger.kernel.org,
vschneid@redhat.com, mgorman@suse.de, bsegall@google.com,
dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
juri.lelli@redhat.com, mingo@redhat.com
> > On Tue, 8 Jul 2025 00:10:54 +0000
> > "Li,Rongqing" <lirongqing@baidu.com> wrote:
> >
> > > > stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > > > + /*
> > > > + * Because mul_u64_u64_div_u64() can approximate on some
> > > > + * architectures; enforce the constraint that: a*b/(b+c) <= a.
> > > > + */
> > > > + if (unlikely(stime > rtime))
> > > > + stime = rtime;
> > >
> > >
> > > My 5.10 has not this patch " sched/cputime: Fix
> > > mul_u64_u64_div_u64() precision for cputime ", but I am sure this
> > > patch can not fix this overflow issue, Since division error happened
> > > in mul_u64_u64_div_u64()
> >
> > Have you tried it? Or are you just making an assumption?
> >
> > How can you be so sure? Did you even *look* at the commit?
>
> It can't be relevant.
> That change is after the mul_u64_u64_div_u64() call that trapped.
> It is also not relevant for x86-64 because it uses the asm version.
>
> At some point mul_u64_u64_div_u64() got changed to be accurate (and slow) so
> that check isn't needed any more.
>
I see, this patch is not relevant.
Thank you very much for the confirmation.
-Li
end of thread, other threads:[~2025-07-08 11:14 UTC | newest]
Thread overview: 23+ messages
2025-07-07 8:14 divide error in x86 and cputime Li,Rongqing
2025-07-07 15:11 ` Steven Rostedt
2025-07-07 22:09 ` Oleg Nesterov
2025-07-07 22:20 ` Steven Rostedt
2025-07-07 22:33 ` Steven Rostedt
2025-07-07 23:00 ` Oleg Nesterov
2025-07-08 11:00 ` David Laight
2025-07-08 1:40 ` Re: " Li,Rongqing
2025-07-08 1:53 ` Steven Rostedt
2025-07-08 1:58 ` Re: " Li,Rongqing
2025-07-08 2:05 ` Steven Rostedt
2025-07-08 2:17 ` Oleg Nesterov
2025-07-08 9:58 ` David Laight
2025-07-07 22:30 ` Oleg Nesterov
2025-07-07 23:41 ` Re: " Li,Rongqing
2025-07-07 23:53 ` Steven Rostedt
2025-07-08 0:10 ` Re: " Li,Rongqing
2025-07-08 0:30 ` Steven Rostedt
2025-07-08 1:17 ` Re: " Li,Rongqing
2025-07-08 1:41 ` Steven Rostedt
2025-07-08 10:35 ` Re: " David Laight
2025-07-08 11:12 ` Re: " Li,Rongqing
2025-07-08 0:23 ` Re: " Li,Rongqing