* divide error in x86 and cputime @ 2025-07-07 8:14 Li,Rongqing 2025-07-07 15:11 ` Steven Rostedt 2025-07-07 22:09 ` Oleg Nesterov 0 siblings, 2 replies; 23+ messages in thread From: Li,Rongqing @ 2025-07-07 8:14 UTC (permalink / raw) To: oleg@redhat.com Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, rostedt@goodmis.org, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, peterz@infradead.org, mingo@redhat.com Hi: I see a divide error on x86 machine, the stack is below: [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI [78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1 [78250815.703853] Hardware name: Inspur SSINSPURMBX-XA3-100D-B356/NF5280A6, BIOS 3.00.21 06/27/2022 [78250815.703859] RIP: 0010:cputime_adjust+0x55/0xb0 [78250815.703860] Code: 3b 4c 8b 4d 10 48 89 c6 49 8d 04 38 4c 39 c8 73 38 48 8b 45 00 48 8b 55 08 48 85 c0 74 16 48 85 d2 74 49 48 8d 0c 10 49 f7 e1 <48> f7 f1 49 39 c0 4c 0f 42 c0 4c 89 c8 4c 29 c0 48 39 c7 77 25 48 [78250815.703861] RSP: 0018:ffffa34c2517bc40 EFLAGS: 00010887 [78250815.703864] RAX: 69f98da9ba980c00 RBX: ffff976c93d2a5e0 RCX: 0000000709e00900 [78250815.703864] RDX: 00f5dfffab0fc352 RSI: 0000000000000082 RDI: ff07410dca0bcd5e [78250815.703865] RBP: ffffa34c2517bc70 R08: 00f5dfff54f8e5ce R09: fffd213aabd74626 [78250815.703866] R10: ffffa34c2517bed8 R11: 0000000000000000 R12: ffff976c93d2a5f0 [78250815.703867] R13: ffffa34c2517bd78 R14: ffffa34c2517bd70 R15: 0000000000001000 [78250815.703868] FS: 00007f58060f97a0(0000) GS:ffff976afe9c0000(0000) knlGS:0000000000000000 [78250815.703869] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [78250815.703870] CR2: 00007f580610e000 CR3: 0000017e3b3d2004 CR4: 0000000000770ee0 [78250815.703870] PKRU: 55555554 [78250815.703871] Call Trace: [78250815.703877] thread_group_cputime_adjusted+0x4a/0x70 [78250815.703881] do_task_stat+0x2ed/0xe00 [78250815.703885] ? 
khugepaged_enter_vma_merge+0x12/0xd0
[78250815.703888] proc_single_show+0x51/0xc0
[78250815.703892] seq_read_iter+0x185/0x3c0
[78250815.703895] seq_read+0x106/0x150
[78250815.703898] vfs_read+0x98/0x180
[78250815.703900] ksys_read+0x59/0xd0
[78250815.703904] do_syscall_64+0x33/0x40
[78250815.703907] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[78250815.703910] RIP: 0033:0x318aeda360

It is caused by a process with many threads running for a very long time: utime+stime overflowed 64 bits, which then causes the division below

mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900);

I see the comment on mul_u64_u64_div_u64() says:

Will generate an #DE when the result doesn't fit u64, could fix with an
__ex_table[] entry when it becomes an issue

It seems the __ex_table[] entry for div does not work?

Thanks

-Li

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime 2025-07-07 8:14 divide error in x86 and cputime Li,Rongqing @ 2025-07-07 15:11 ` Steven Rostedt 2025-07-07 22:09 ` Oleg Nesterov 1 sibling, 0 replies; 23+ messages in thread From: Steven Rostedt @ 2025-07-07 15:11 UTC (permalink / raw) To: Li,Rongqing Cc: oleg@redhat.com, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, peterz@infradead.org, mingo@redhat.com On Mon, 7 Jul 2025 08:14:41 +0000 "Li,Rongqing" <lirongqing@baidu.com> wrote: > Hi: > > I see a divide error on x86 machine, the stack is below: > > > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI > [78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1 Did you see this on a 5.10 kernel? Do you see it on something more recent? Preferably the 6.15 or 6.16. -- Steve > [78250815.703853] Hardware name: Inspur SSINSPURMBX-XA3-100D-B356/NF5280A6, BIOS 3.00.21 06/27/2022 > [78250815.703859] RIP: 0010:cputime_adjust+0x55/0xb0 > [78250815.703860] Code: 3b 4c 8b 4d 10 48 89 c6 49 8d 04 38 4c 39 c8 73 38 48 8b 45 00 48 8b 55 08 48 85 c0 74 16 48 85 d2 74 49 48 8d 0c 10 49 f7 e1 <48> f7 f1 49 39 c0 4c 0f 42 c0 4c 89 c8 4c 29 c0 48 39 c7 77 25 48 > [78250815.703861] RSP: 0018:ffffa34c2517bc40 EFLAGS: 00010887 > [78250815.703864] RAX: 69f98da9ba980c00 RBX: ffff976c93d2a5e0 RCX: 0000000709e00900 > [78250815.703864] RDX: 00f5dfffab0fc352 RSI: 0000000000000082 RDI: ff07410dca0bcd5e > [78250815.703865] RBP: ffffa34c2517bc70 R08: 00f5dfff54f8e5ce R09: fffd213aabd74626 > [78250815.703866] R10: ffffa34c2517bed8 R11: 0000000000000000 R12: ffff976c93d2a5f0 > [78250815.703867] R13: ffffa34c2517bd78 R14: ffffa34c2517bd70 R15: 0000000000001000 > [78250815.703868] FS: 00007f58060f97a0(0000) GS:ffff976afe9c0000(0000) knlGS:0000000000000000 > [78250815.703869] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [78250815.703870] CR2: 
00007f580610e000 CR3: 0000017e3b3d2004 CR4: 0000000000770ee0 > [78250815.703870] PKRU: 55555554 > [78250815.703871] Call Trace: > [78250815.703877] thread_group_cputime_adjusted+0x4a/0x70 > [78250815.703881] do_task_stat+0x2ed/0xe00 > [78250815.703885] ? khugepaged_enter_vma_merge+0x12/0xd0 > [78250815.703888] proc_single_show+0x51/0xc0 > [78250815.703892] seq_read_iter+0x185/0x3c0 > [78250815.703895] seq_read+0x106/0x150 > [78250815.703898] vfs_read+0x98/0x180 > [78250815.703900] ksys_read+0x59/0xd0 > [78250815.703904] do_syscall_64+0x33/0x40 > [78250815.703907] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [78250815.703910] RIP: 0033:0x318aeda360 > > > It caused by a process with many threads running very long, and utime+stime overflowed 64bit, then cause the below div > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900); > > I see the comments of mul_u64_u64_div_u64() say: > > Will generate an #DE when the result doesn't fit u64, could fix with an > __ex_table[] entry when it becomes an issu > > > Seem __ex_table[] entry for div does not work ? > > Thanks > > -Li > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime 2025-07-07 8:14 divide error in x86 and cputime Li,Rongqing 2025-07-07 15:11 ` Steven Rostedt @ 2025-07-07 22:09 ` Oleg Nesterov 2025-07-07 22:20 ` Steven Rostedt 2025-07-07 22:30 ` Oleg Nesterov 1 sibling, 2 replies; 23+ messages in thread From: Oleg Nesterov @ 2025-07-07 22:09 UTC (permalink / raw) To: Li,Rongqing, Peter Zijlstra, David Laight Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, rostedt@goodmis.org, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com On 07/07, Li,Rongqing wrote: > > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI ... > It caused by a process with many threads running very long, > and utime+stime overflowed 64bit, then cause the below div > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900); > > I see the comments of mul_u64_u64_div_u64() say: > > Will generate an #DE when the result doesn't fit u64, could fix with an > __ex_table[] entry when it becomes an issu > > Seem __ex_table[] entry for div does not work ? Well, the current version doesn't have an __ex_table[] entry for div... I do not know what can/should we do in this case... Perhaps static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div) { int ok = 0; u64 q; asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n" _ASM_EXTABLE(1b, 2b) : "=a" (q), "+r" (ok) : "a" (a), "rm" (mul), "rm" (div) : "rdx"); return ok ? q : -1ul; } ? Should return ULLONG_MAX on #DE. Oleg. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime 2025-07-07 22:09 ` Oleg Nesterov @ 2025-07-07 22:20 ` Steven Rostedt 2025-07-07 22:33 ` Steven Rostedt 2025-07-07 22:30 ` Oleg Nesterov 1 sibling, 1 reply; 23+ messages in thread From: Steven Rostedt @ 2025-07-07 22:20 UTC (permalink / raw) To: Oleg Nesterov Cc: Li,Rongqing, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com On Tue, 8 Jul 2025 00:09:38 +0200 Oleg Nesterov <oleg@redhat.com> wrote: > Well, the current version doesn't have an __ex_table[] entry for div... > > I do not know what can/should we do in this case... Perhaps > > static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div) > { > int ok = 0; > u64 q; > > asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n" > _ASM_EXTABLE(1b, 2b) > : "=a" (q), "+r" (ok) > : "a" (a), "rm" (mul), "rm" (div) > : "rdx"); > > return ok ? q : -1ul; > } > > ? > > Should return ULLONG_MAX on #DE. I would say this should never happen and if it does, let the kernel crash. -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
  2025-07-07 22:20 ` Steven Rostedt
@ 2025-07-07 22:33 ` Steven Rostedt
  2025-07-07 23:00 ` Oleg Nesterov
  2025-07-08  1:40 ` 答复: [????] " Li,Rongqing
  0 siblings, 2 replies; 23+ messages in thread
From: Steven Rostedt @ 2025-07-07 22:33 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Li,Rongqing, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com

On Mon, 7 Jul 2025 18:20:56 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> I would say this should never happen and if it does, let the kernel crash.

>> [78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1

This happened on a 5.10 kernel with a proprietary module loaded, so honestly, if it can't be reproduced on a newer kernel without any proprietary modules loaded, I say we don't worry about it.

I also don't buy the utime + stime overflowing a 64-bit number.

2^64 / 2 = 2^63 = 9223372036854775808

That would be:

                           minutes        days
                              v            v
9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27
                         ^            ^                       ^
                     ns -> sec      hours                   years

So even though the report says they have threads running for a very long time, that would still be 292 years of run time!

-- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime 2025-07-07 22:33 ` Steven Rostedt @ 2025-07-07 23:00 ` Oleg Nesterov 2025-07-08 11:00 ` David Laight 2025-07-08 1:40 ` 答复: [????] " Li,Rongqing 1 sibling, 1 reply; 23+ messages in thread From: Oleg Nesterov @ 2025-07-07 23:00 UTC (permalink / raw) To: Steven Rostedt Cc: Li,Rongqing, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com On 07/07, Steven Rostedt wrote: > > On Mon, 7 Jul 2025 18:20:56 -0400 > Steven Rostedt <rostedt@goodmis.org> wrote: > > > I would say this should never happen and if it does, let the kernel crash. > > >> [78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1 > > This happened on a 5.10 kernel with a proprietary module loaded, so > honestly, if it can't be reproduced on a newer kernel without any > proprietary modules loaded, I say we don't worry about it. Yes, agreed, see my reply to myself. Oleg. > I also don't by the utime + stime overflowing a 64bit number. > > 2^64 / 2 = 2^63 = 9223372036854775808 > > That would be: > > minutes days > v v > 9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27 > ^ ^ ^ > ns -> sec hours years > > So the report says they have threads running for a very long time, it would > still be 292 years of run time! > > -- Steve > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
  2025-07-07 23:00 ` Oleg Nesterov
@ 2025-07-08 11:00 ` David Laight
  0 siblings, 0 replies; 23+ messages in thread
From: David Laight @ 2025-07-08 11:00 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Steven Rostedt, Li,Rongqing, Peter Zijlstra, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com

On Tue, 8 Jul 2025 01:00:57 +0200
Oleg Nesterov <oleg@redhat.com> wrote:

> On 07/07, Steven Rostedt wrote:
> >
> > On Mon, 7 Jul 2025 18:20:56 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > > I would say this should never happen and if it does, let the kernel crash.
> >
> > >> [78250815.703852] CPU: 127 PID: 83435 Comm: killall Kdump: loaded Tainted: P OE K 5.10.0 #1
> >
> > This happened on a 5.10 kernel with a proprietary module loaded, so
> > honestly, if it can't be reproduced on a newer kernel without any
> > proprietary modules loaded, I say we don't worry about it.
>
> Yes, agreed, see my reply to myself.

Except that just isn't relevant.
The problem is that the process running time (across all threads) can easily exceed 2^64 nanoseconds.
With CPUs having more and more cores and software spinning to reduce latency, it will get more and more common.

Perhaps standardising on ns for timers (etc) wasn't such a bright idea.
Maybe 100ns would have been better.
But the process 'rtime' does need dividing down somewhat.
Thread 'rtime' is fine - 584 years isn't going to be our problem!

	David

^ permalink raw reply	[flat|nested] 23+ messages in thread
* 答复: [????] Re: divide error in x86 and cputime
  2025-07-07 22:33 ` Steven Rostedt
  2025-07-07 23:00 ` Oleg Nesterov
@ 2025-07-08  1:40 ` Li,Rongqing
  2025-07-08  1:53 ` Steven Rostedt
  1 sibling, 1 reply; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 1:40 UTC (permalink / raw)
To: Steven Rostedt, Oleg Nesterov
Cc: Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com

> That would be:
>
>                            minutes        days
>                               v            v
> 9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27
>                          ^            ^                       ^
>                      ns -> sec      hours                   years
>
> So the report says they have threads running for a very long time, it would still
> be 292 years of run time!

Utime/rtime is u64, so an overflow needs 292.27*2 = 584 years.

But with multiple threads, say 292 threads, it only needs two years, since it is the thread group's total running time:

void thread_group_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
{
	struct task_cputime cputime;

	thread_group_cputime(p, &cputime);
	cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
}

-Li

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: [????] Re: divide error in x86 and cputime 2025-07-08 1:40 ` 答复: [????] " Li,Rongqing @ 2025-07-08 1:53 ` Steven Rostedt 2025-07-08 1:58 ` 答复: [????] " Li,Rongqing 0 siblings, 1 reply; 23+ messages in thread From: Steven Rostedt @ 2025-07-08 1:53 UTC (permalink / raw) To: Li,Rongqing Cc: Oleg Nesterov, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com On Tue, 8 Jul 2025 01:40:27 +0000 "Li,Rongqing" <lirongqing@baidu.com> wrote: > > That would be: > > > > minutes days > > v v > > 9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27 > > ^ ^ ^ > > ns -> sec hours years > > > > So the report says they have threads running for a very long time, it would still > > be 292 years of run time! > > Utime/rtime is u64, it means overflow needs 292.27*2=584 year, > > But with multiple thread, like 292 threads, it only need two years, it is a thread group total running time > > > void thread_group_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st) > { > struct task_cputime cputime; > > thread_group_cputime(p, &cputime); > cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st); > } > So you are saying that you have been running this for over two years without a reboot? Then the issue isn't the divider, it's that the thread group cputime can overflow. Perhaps it needs a cap, or a way to "reset" somehow after "so long"? -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* 答复: [????] Re: [????] Re: divide error in x86 and cputime
  2025-07-08  1:53 ` Steven Rostedt
@ 2025-07-08  1:58 ` Li,Rongqing
  2025-07-08  2:05 ` Steven Rostedt
  0 siblings, 1 reply; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 1:58 UTC (permalink / raw)
To: Steven Rostedt
Cc: Oleg Nesterov, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com

> "Li,Rongqing" <lirongqing@baidu.com> wrote:
>
> > > That would be:
> > >
> > >                            minutes        days
> > >                               v            v
> > > 9223372036854775808 / 1000000000 / 60 / 60 / 24 / 365.25 = 292.27
> > >                          ^            ^                       ^
> > >                      ns -> sec      hours                   years
> > >
> > > So the report says they have threads running for a very long time,
> > > it would still be 292 years of run time!
> >
> > Utime/rtime is u64, it means overflow needs 292.27*2=584 year,
> >
> > But with multiple thread, like 292 threads, it only need two years, it
> > is a thread group total running time
> >
> >
> > void thread_group_cputime_adjusted(struct task_struct *p, u64 *ut, u64 *st)
> > {
> > 	struct task_cputime cputime;
> >
> > 	thread_group_cputime(p, &cputime);
> > 	cputime_adjust(&cputime, &p->signal->prev_cputime, ut, st);
> > }
>
> So you are saying that you have been running this for over two years without a reboot?

Yes. Considering that machines keep gaining more CPUs, I think it will be a common case.

> Then the issue isn't the divider, it's that the thread group cputime can overflow.
> Perhaps it needs a cap, or a way to "reset" somehow after "so long"?

It is not clear how to reset.

But mul_u64_u64_div_u64() for x86 should not trigger a division-error panic; maybe it should return ULLONG_MAX on #DE (like the non-x86 mul_u64_u64_div_u64()).

>
> -- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: [????] Re: [????] Re: divide error in x86 and cputime 2025-07-08 1:58 ` 答复: [????] " Li,Rongqing @ 2025-07-08 2:05 ` Steven Rostedt 2025-07-08 2:17 ` Oleg Nesterov 0 siblings, 1 reply; 23+ messages in thread From: Steven Rostedt @ 2025-07-08 2:05 UTC (permalink / raw) To: Li,Rongqing Cc: Oleg Nesterov, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com On Tue, 8 Jul 2025 01:58:00 +0000 "Li,Rongqing" <lirongqing@baidu.com> wrote: > But mul_u64_u64_div_u64() for x86 should not trigger a division error panic, maybe should return a ULLONG_MAX on #DE (like non-x86 mul_u64_u64_div_u64(),) Perhaps. But it is still producing garbage. -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [????] Re: [????] Re: divide error in x86 and cputime 2025-07-08 2:05 ` Steven Rostedt @ 2025-07-08 2:17 ` Oleg Nesterov 2025-07-08 9:58 ` David Laight 0 siblings, 1 reply; 23+ messages in thread From: Oleg Nesterov @ 2025-07-08 2:17 UTC (permalink / raw) To: Steven Rostedt Cc: Li,Rongqing, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com On 07/07, Steven Rostedt wrote: > > On Tue, 8 Jul 2025 01:58:00 +0000 > "Li,Rongqing" <lirongqing@baidu.com> wrote: > > > But mul_u64_u64_div_u64() for x86 should not trigger a division error panic, > maybe should return a ULLONG_MAX on #DE (like non-x86 mul_u64_u64_div_u64(),) > > Perhaps. So do you think static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div) { int ok = 0; u64 q; asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n" _ASM_EXTABLE(1b, 2b) : "=a" (q), "+r" (ok) : "a" (a), "rm" (mul), "rm" (div) : "rdx"); return ok ? q : -1ul; } makes sense at least for consistency with the generic implementation in lib/math/div64.c ? > But it is still producing garbage. Agreed. And not a solution to this particular problem. Oleg. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [????] Re: [????] Re: divide error in x86 and cputime 2025-07-08 2:17 ` Oleg Nesterov @ 2025-07-08 9:58 ` David Laight 0 siblings, 0 replies; 23+ messages in thread From: David Laight @ 2025-07-08 9:58 UTC (permalink / raw) To: Oleg Nesterov Cc: Steven Rostedt, Li,Rongqing, Peter Zijlstra, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com On Tue, 8 Jul 2025 04:17:04 +0200 Oleg Nesterov <oleg@redhat.com> wrote: > On 07/07, Steven Rostedt wrote: > > > > On Tue, 8 Jul 2025 01:58:00 +0000 > > "Li,Rongqing" <lirongqing@baidu.com> wrote: > > > > > But mul_u64_u64_div_u64() for x86 should not trigger a division error panic, > > maybe should return a ULLONG_MAX on #DE (like non-x86 mul_u64_u64_div_u64(),) > > > > Perhaps. > > So do you think > > static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div) > { > int ok = 0; > u64 q; > > asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n" > _ASM_EXTABLE(1b, 2b) > : "=a" (q), "+r" (ok) > : "a" (a), "rm" (mul), "rm" (div) > : "rdx"); > > return ok ? q : -1ul; You need to decide what to return/do when 'div' is zero. So perhaps: if (ok) return q; BUG_ON(!div); return ~(u64)0; But maybe 0/0 should return 0. > } > > makes sense at least for consistency with the generic implementation > in lib/math/div64.c ? I don't like the way the current version handles divide by zero at all. Even forcing the cpu to execute a 'divide by zero' doesn't seem right. The result should be well defined (and useful). It might even be worth adding an extra parameter to report overflow and return ~0 for overflow and 0 for divide by zero (I think that is less likely to cause grief in the following instructions). That does 'pass the buck' to the caller. > > > But it is still producing garbage. > > Agreed. And not a solution to this particular problem. 
Using mul_u64_u64_div_u64() here is also horribly expensive for a simple split between (IIRC) utime and stime.
It isn't too bad on x86-64, but everywhere else it is horrid.
For 'random' values the code hits 900 clocks on x86-32 - and that is in userspace with cmov and %ebp as a general register.
My new version is ~230 for x86-32 and ~130 for x86-64 (not doing the fast asm) on Ivy Bridge, ~80 for x86-64 on Zen 5.
(I'm on holiday and have limited systems available.)

	David

>
> Oleg.
>

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime 2025-07-07 22:09 ` Oleg Nesterov 2025-07-07 22:20 ` Steven Rostedt @ 2025-07-07 22:30 ` Oleg Nesterov 2025-07-07 23:41 ` 答复: [????] " Li,Rongqing 1 sibling, 1 reply; 23+ messages in thread From: Oleg Nesterov @ 2025-07-07 22:30 UTC (permalink / raw) To: Li,Rongqing, Peter Zijlstra, David Laight Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, rostedt@goodmis.org, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com On a second thought, this mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900); stime rtime stime + utime looks suspicious: - stime > stime + utime - rtime = 0xfffd213aabd74626 is absurdly huge so perhaps there is another problem? Oleg. On 07/08, Oleg Nesterov wrote: > > On 07/07, Li,Rongqing wrote: > > > > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI > > ... > > > It caused by a process with many threads running very long, > > and utime+stime overflowed 64bit, then cause the below div > > > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900); > > > > I see the comments of mul_u64_u64_div_u64() say: > > > > Will generate an #DE when the result doesn't fit u64, could fix with an > > __ex_table[] entry when it becomes an issu > > > > Seem __ex_table[] entry for div does not work ? > > Well, the current version doesn't have an __ex_table[] entry for div... > > I do not know what can/should we do in this case... Perhaps > > static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div) > { > int ok = 0; > u64 q; > > asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n" > _ASM_EXTABLE(1b, 2b) > : "=a" (q), "+r" (ok) > : "a" (a), "rm" (mul), "rm" (div) > : "rdx"); > > return ok ? q : -1ul; > } > > ? > > Should return ULLONG_MAX on #DE. > > Oleg. ^ permalink raw reply [flat|nested] 23+ messages in thread
* 答复: [????] Re: divide error in x86 and cputime
  2025-07-07 22:30 ` Oleg Nesterov
@ 2025-07-07 23:41 ` Li,Rongqing
  2025-07-07 23:53 ` Steven Rostedt
  2025-07-08  0:23 ` 答复: " Li,Rongqing
  0 siblings, 2 replies; 23+ messages in thread
From: Li,Rongqing @ 2025-07-07 23:41 UTC (permalink / raw)
To: Oleg Nesterov, Peter Zijlstra, David Laight
Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, rostedt@goodmis.org, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com

> On a second thought, this
>
> mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900);
>                     stime               rtime               stime + utime
>
> looks suspicious:
>
> - stime > stime + utime
>
> - rtime = 0xfffd213aabd74626 is absurdly huge
>
> so perhaps there is another problem?

It happened when a process with 236 busy-polling threads ran for about 904 days; the total time then overflows 64 bits.

Non-x86 systems may have the same issue: once (stime + utime) overflows 64 bits, mul_u64_u64_div_u64() from lib/math/div64.c may cause a division by 0.

So for cputime, could cputime_adjust() return stime when stime + utime has overflowed?

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 6dab4854..db0c273 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
 		goto update;
 	}

+	if (stime > (stime + utime)) {
+		goto update;
+	}
+
 	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
 	/*
 	 * Because mul_u64_u64_div_u64() can approximate on some

Thanks

-Li

> Oleg.
>
> On 07/08, Oleg Nesterov wrote:
> >
> > On 07/07, Li,Rongqing wrote:
> > >
> > > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
> >
> > ...
> > > > > It caused by a process with many threads running very long, and > > > utime+stime overflowed 64bit, then cause the below div > > > > > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, > > > 0x09e00900); > > > > > > I see the comments of mul_u64_u64_div_u64() say: > > > > > > Will generate an #DE when the result doesn't fit u64, could fix with > > > an __ex_table[] entry when it becomes an issu > > > > > > Seem __ex_table[] entry for div does not work ? > > > > Well, the current version doesn't have an __ex_table[] entry for div... > > > > I do not know what can/should we do in this case... Perhaps > > > > static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div) > > { > > int ok = 0; > > u64 q; > > > > asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n" > > _ASM_EXTABLE(1b, 2b) > > : "=a" (q), "+r" (ok) > > : "a" (a), "rm" (mul), "rm" (div) > > : "rdx"); > > > > return ok ? q : -1ul; > > } > > > > ? > > > > Should return ULLONG_MAX on #DE. > > > > Oleg. ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [????] Re: divide error in x86 and cputime 2025-07-07 23:41 ` 答复: [????] " Li,Rongqing @ 2025-07-07 23:53 ` Steven Rostedt 2025-07-08 0:10 ` 答复: [????] " Li,Rongqing 2025-07-08 0:23 ` 答复: " Li,Rongqing 1 sibling, 1 reply; 23+ messages in thread From: Steven Rostedt @ 2025-07-07 23:53 UTC (permalink / raw) To: Li,Rongqing Cc: Oleg Nesterov, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com On Mon, 7 Jul 2025 23:41:14 +0000 "Li,Rongqing" <lirongqing@baidu.com> wrote: > > On a second thought, this > > > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, > > 0x09e00900); > > stime rtime > > stime + utime > > > > looks suspicious: > > > > - stime > stime + utime > > > > - rtime = 0xfffd213aabd74626 is absurdly huge > > > > so perhaps there is another problem? > > > > it happened when a process with 236 busy polling threads , run about 904 days, the total time will overflow the 64bit > > non-x86 system maybe has same issue, once (stime + utime) overflows 64bit, mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division by 0 > > so to cputime, could cputime_adjust() return stime if stime if stime + utime is overflow > > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c > index 6dab4854..db0c273 100644 > --- a/kernel/sched/cputime.c > +++ b/kernel/sched/cputime.c > @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev, > goto update; > } > > + if (stime > (stime + utime)) { > + goto update; > + } > + > stime = mul_u64_u64_div_u64(stime, rtime, stime + utime); > /* > * Because mul_u64_u64_div_u64() can approximate on some > Are you running 5.10.0? 
Because a diff of 5.10.238 from 5.10.0 gives: @@ -579,6 +579,12 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev, } stime = mul_u64_u64_div_u64(stime, rtime, stime + utime); + /* + * Because mul_u64_u64_div_u64() can approximate on some + * achitectures; enforce the constraint that: a*b/(b+c) <= a. + */ + if (unlikely(stime > rtime)) + stime = rtime; update: Thus the result is what's getting screwed up. -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* 答复: [????] Re: [????] Re: divide error in x86 and cputime 2025-07-07 23:53 ` Steven Rostedt @ 2025-07-08 0:10 ` Li,Rongqing 2025-07-08 0:30 ` Steven Rostedt 0 siblings, 1 reply; 23+ messages in thread From: Li,Rongqing @ 2025-07-08 0:10 UTC (permalink / raw) To: Steven Rostedt Cc: Oleg Nesterov, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de, bsegall@google.com, dietmar.eggemann@arm.com, vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com > On Mon, 7 Jul 2025 23:41:14 +0000 > "Li,Rongqing" <lirongqing@baidu.com> wrote: > > > > On a second thought, this > > > > > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, > > > 0x09e00900); > > > stime rtime > > > stime + utime > > > > > > looks suspicious: > > > > > > - stime > stime + utime > > > > > > - rtime = 0xfffd213aabd74626 is absurdly huge > > > > > > so perhaps there is another problem? > > > > > > > it happened when a process with 236 busy polling threads , run about > > 904 days, the total time will overflow the 64bit > > > > non-x86 system maybe has same issue, once (stime + utime) overflows > > 64bit, mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division > > by 0 > > > > so to cputime, could cputime_adjust() return stime if stime if stime + > > utime is overflow > > > > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index > > 6dab4854..db0c273 100644 > > --- a/kernel/sched/cputime.c > > +++ b/kernel/sched/cputime.c > > @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, > struct prev_cputime *prev, > > goto update; > > } > > > > + if (stime > (stime + utime)) { > > + goto update; > > + } > > + > > stime = mul_u64_u64_div_u64(stime, rtime, stime + utime); > > /* > > * Because mul_u64_u64_div_u64() can approximate on some > > > > Are you running 5.10.0? 
> Because a diff of 5.10.238 from 5.10.0 gives:
>
> @@ -579,6 +579,12 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
> 	}
>
> 	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> +	/*
> +	 * Because mul_u64_u64_div_u64() can approximate on some
> +	 * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> +	 */
> +	if (unlikely(stime > rtime))
> +		stime = rtime;

My 5.10 does not have this patch ("sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime"), but I am sure this patch cannot fix this overflow issue, since the division error happened inside mul_u64_u64_div_u64().

Thanks

-Li

>
> update:
>
> Thus the result is what's getting screwed up.
>
> -- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: [????] Re: [????] Re: divide error in x86 and cputime
  2025-07-08  0:10 ` Reply: [????] " Li,Rongqing
@ 2025-07-08  0:30 ` Steven Rostedt
  2025-07-08  1:17   ` Reply: [????] " Li,Rongqing
  2025-07-08 10:35   ` [????] Re: [????] " David Laight
  0 siblings, 2 replies; 23+ messages in thread
From: Steven Rostedt @ 2025-07-08 0:30 UTC (permalink / raw)
To: Li,Rongqing
Cc: Oleg Nesterov, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org,
	vschneid@redhat.com, mgorman@suse.de, bsegall@google.com,
	dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
	juri.lelli@redhat.com, mingo@redhat.com

On Tue, 8 Jul 2025 00:10:54 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:

> > 	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > +	/*
> > +	 * Because mul_u64_u64_div_u64() can approximate on some
> > +	 * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> > +	 */
> > +	if (unlikely(stime > rtime))
> > +		stime = rtime;
>
> My 5.10 has not this patch "sched/cputime: Fix mul_u64_u64_div_u64()
> precision for cputime", but I am sure this patch can not fix this
> overflow issue, since division error happened in mul_u64_u64_div_u64()

Have you tried it? Or are you just making an assumption?

How can you be so sure? Did you even *look* at the commit?

    sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime

    In extreme test scenarios:
    the 14th field utime in /proc/xx/stat is greater than sum_exec_runtime,
    utime = 18446744073709518790 ns, rtime = 135989749728000 ns

    In cputime_adjust() process, stime is greater than rtime due to
    mul_u64_u64_div_u64() precision problem.
    before call mul_u64_u64_div_u64(),
    stime = 175136586720000, rtime = 135989749728000, utime = 1416780000.
    after call mul_u64_u64_div_u64(),
    stime = 135989949653530

    unsigned reversion occurs because rtime is less than stime.
    utime = rtime - stime = 135989749728000 - 135989949653530
          = -199925530
          = (u64)18446744073709518790

    Trigger condition:
    1). User task run in kernel mode most of time
    2). ARM64 architecture
    3). TICK_CPU_ACCOUNTING=y
        CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set

    Fix mul_u64_u64_div_u64() conversion precision by reset stime to rtime

When stime ends up greater than rtime, it causes utime to go NEGATIVE!

That means *YES* it can overflow a u64 number. That's your bug.

Next time, look to see if there are fixes in the code that is triggering
issues for you, and test them out, before bothering upstream.

Goodbye.

-- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Reply: [????] Re: [????] Re: [????] Re: divide error in x86 and cputime
  2025-07-08  0:30 ` Steven Rostedt
@ 2025-07-08  1:17 ` Li,Rongqing
  2025-07-08  1:41   ` Steven Rostedt
  0 siblings, 1 reply; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 1:17 UTC (permalink / raw)
To: Steven Rostedt
Cc: Oleg Nesterov, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org,
	vschneid@redhat.com, mgorman@suse.de, bsegall@google.com,
	dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
	juri.lelli@redhat.com, mingo@redhat.com

> Have you tried it? Or are you just making an assumption?
>
> How can you be so sure? Did you even *look* at the commit?
>
>     sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
>
>     In extreme test scenarios:
>     the 14th field utime in /proc/xx/stat is greater than sum_exec_runtime,
>     utime = 18446744073709518790 ns, rtime = 135989749728000 ns
>
>     In cputime_adjust() process, stime is greater than rtime due to
>     mul_u64_u64_div_u64() precision problem.
>     before call mul_u64_u64_div_u64(),
>     stime = 175136586720000, rtime = 135989749728000, utime = 1416780000.
>     after call mul_u64_u64_div_u64(),
>     stime = 135989949653530
>
>     unsigned reversion occurs because rtime is less than stime.
>     utime = rtime - stime = 135989749728000 - 135989949653530
>           = -199925530
>           = (u64)18446744073709518790
>

I will try to test this patch, but I think it is a different case:
stime is not greater than rtime in my case (stime = 0x69f98da9ba980c00,
rtime = 0xfffd213aabd74626, stime + utime = 0x9e00900, so utime should
be 0x960672564f47fd00), and this overflowing process, with 236 busy-poll
threads, has been running for about 904 days, so I think these times are
correct.

Thanks

-Li

>     Trigger condition:
>     1). User task run in kernel mode most of time
>     2). ARM64 architecture
>     3). TICK_CPU_ACCOUNTING=y
>         CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
>
>     Fix mul_u64_u64_div_u64() conversion precision by reset stime to rtime
>
> When stime ends up greater than rtime, it causes utime to go NEGATIVE!
>
> That means *YES* it can overflow a u64 number. That's your bug.
>
> Next time, look to see if there are fixes in the code that is triggering
> issues for you, and test them out, before bothering upstream.
>
> Goodbye.
>
> -- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: divide error in x86 and cputime
  2025-07-08  1:17 ` Reply: [????] " Li,Rongqing
@ 2025-07-08  1:41 ` Steven Rostedt
  0 siblings, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2025-07-08 1:41 UTC (permalink / raw)
To: Li,Rongqing
Cc: Oleg Nesterov, Peter Zijlstra, David Laight, linux-kernel@vger.kernel.org,
	vschneid@redhat.com, mgorman@suse.de, bsegall@google.com,
	dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
	juri.lelli@redhat.com, mingo@redhat.com

On Tue, 8 Jul 2025 01:17:50 +0000
"Li,Rongqing" <lirongqing@baidu.com> wrote:

> Stime is not greater than rtime in my case, (stime= 0x69f98da9ba980c00,
> rtime= 0xfffd213aabd74626, stime+utime= 0x9e00900. So utime should be
> 0x960672564f47fd00 ), and this overflow process with 236 busy poll
> threads running about 904 day, so I think these times are correct

But look at rtime, it is *negative*. So maybe that fix isn't going to fix
this bug, but rtime is most definitely screwed up. That value is:

  0xfffd213aabd74626 = (u64)18445936184654251558 = (s64)-807889055300058

There's no way run time should be 584 years in nanoseconds.

So if it's not fixed by that commit, it's a bug that happened before you
even got to the mul_u64_u64_div_u64() function. Touching that is only
putting a band-aid on the symptom, you haven't touched the real bug.

I bet there's likely another fix between what you are using and 5.10.238.
There's 31,101 commits between those two.

You are using a way old kernel without any fixes to it. It is known to be
buggy. You will hit bugs with it. No need to tell us about it.

-- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Re: [????] Re: [????] Re: divide error in x86 and cputime
  2025-07-08  0:30 ` Steven Rostedt
  2025-07-08  1:17 ` Reply: [????] " Li,Rongqing
@ 2025-07-08 10:35 ` David Laight
  2025-07-08 11:12   ` Reply: [????] " Li,Rongqing
  1 sibling, 1 reply; 23+ messages in thread
From: David Laight @ 2025-07-08 10:35 UTC (permalink / raw)
To: Steven Rostedt
Cc: Li,Rongqing, Oleg Nesterov, Peter Zijlstra, linux-kernel@vger.kernel.org,
	vschneid@redhat.com, mgorman@suse.de, bsegall@google.com,
	dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
	juri.lelli@redhat.com, mingo@redhat.com

On Mon, 7 Jul 2025 20:30:57 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 8 Jul 2025 00:10:54 +0000
> "Li,Rongqing" <lirongqing@baidu.com> wrote:
>
> > > 	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > > +	/*
> > > +	 * Because mul_u64_u64_div_u64() can approximate on some
> > > +	 * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> > > +	 */
> > > +	if (unlikely(stime > rtime))
> > > +		stime = rtime;
> >
> > My 5.10 has not this patch "sched/cputime: Fix mul_u64_u64_div_u64()
> > precision for cputime", but I am sure this patch can not fix this
> > overflow issue, since division error happened in mul_u64_u64_div_u64()
>
> Have you tried it? Or are you just making an assumption?
>
> How can you be so sure? Did you even *look* at the commit?

It can't be relevant.
That change is after the mul_u64_u64_div_u64() call that trapped.
It is also not relevant for x86-64 because it uses the asm version.

At some point mul_u64_u64_div_u64() got changed to be accurate (and slow)
so that check isn't needed any more.

	David

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Reply: [????] Re: [????] Re: [????] Re: divide error in x86 and cputime
  2025-07-08 10:35 ` [????] Re: [????] " David Laight
@ 2025-07-08 11:12 ` Li,Rongqing
  0 siblings, 0 replies; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 11:12 UTC (permalink / raw)
To: David Laight, Steven Rostedt
Cc: Oleg Nesterov, Peter Zijlstra, linux-kernel@vger.kernel.org,
	vschneid@redhat.com, mgorman@suse.de, bsegall@google.com,
	dietmar.eggemann@arm.com, vincent.guittot@linaro.org,
	juri.lelli@redhat.com, mingo@redhat.com

> > On Tue, 8 Jul 2025 00:10:54 +0000
> > "Li,Rongqing" <lirongqing@baidu.com> wrote:
> >
> > > > 	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > > > +	/*
> > > > +	 * Because mul_u64_u64_div_u64() can approximate on some
> > > > +	 * achitectures; enforce the constraint that: a*b/(b+c) <= a.
> > > > +	 */
> > > > +	if (unlikely(stime > rtime))
> > > > +		stime = rtime;
> > >
> > > My 5.10 has not this patch "sched/cputime: Fix mul_u64_u64_div_u64()
> > > precision for cputime", but I am sure this patch can not fix this
> > > overflow issue, since division error happened in mul_u64_u64_div_u64()
> >
> > Have you tried it? Or are you just making an assumption?
> >
> > How can you be so sure? Did you even *look* at the commit?
>
> It can't be relevant.
> That change is after the mul_u64_u64_div_u64() call that trapped.
> It is also not relevant for x86-64 because it uses the asm version.
>
> At some point mul_u64_u64_div_u64() got changed to be accurate (and slow)
> so that check isn't needed any more.

I see; this patch is not relevant. Thank you very much for the confirmation.

-Li

^ permalink raw reply	[flat|nested] 23+ messages in thread
* Reply: [????] Re: divide error in x86 and cputime
  2025-07-07 23:41 ` Reply: [????] " Li,Rongqing
  2025-07-07 23:53   ` Steven Rostedt
@ 2025-07-08  0:23 ` Li,Rongqing
  1 sibling, 0 replies; 23+ messages in thread
From: Li,Rongqing @ 2025-07-08 0:23 UTC (permalink / raw)
To: Oleg Nesterov, Peter Zijlstra, David Laight
Cc: linux-kernel@vger.kernel.org, vschneid@redhat.com, mgorman@suse.de,
	bsegall@google.com, rostedt@goodmis.org, dietmar.eggemann@arm.com,
	vincent.guittot@linaro.org, juri.lelli@redhat.com, mingo@redhat.com

> A non-x86 system may have the same issue: once (stime + utime) overflows
> 64 bits, mul_u64_u64_div_u64 from lib/math/div64.c may cause a division
> by 0

To correct this: mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626,
0x009e00900) of lib/math/div64.c returns 0xffffffffffffffff.

> So, for cputime, could cputime_adjust() return stime when stime + utime
> overflows?
>
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 6dab4854..db0c273 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
>  		goto update;
>  	}
>
> +	if (stime > (stime + utime)) {
> +		goto update;
> +	}
> +
>  	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
>  	/*
>  	 * Because mul_u64_u64_div_u64() can approximate on some
>
> Thanks
>
> -Li
>
> > Oleg.
> >
> > On 07/08, Oleg Nesterov wrote:
> > >
> > > On 07/07, Li,Rongqing wrote:
> > > >
> > > > [78250815.703847] divide error: 0000 [#1] PREEMPT SMP NOPTI
> > >
> > > ...
> > >
> > > > It caused by a process with many threads running very long, and
> > > > utime+stime overflowed 64bit, then cause the below div
> > > >
> > > > mul_u64_u64_div_u64(0x69f98da9ba980c00, 0xfffd213aabd74626, 0x09e00900);
> > > >
> > > > I see the comments of mul_u64_u64_div_u64() say:
> > > >
> > > > Will generate an #DE when the result doesn't fit u64, could fix
> > > > with an __ex_table[] entry when it becomes an issue
> > > >
> > > > Seem __ex_table[] entry for div does not work ?
> > >
> > > Well, the current version doesn't have an __ex_table[] entry for div...
> > >
> > > I do not know what can/should we do in this case... Perhaps
> > >
> > > static inline u64 mul_u64_u64_div_u64(u64 a, u64 mul, u64 div)
> > > {
> > > 	int ok = 0;
> > > 	u64 q;
> > >
> > > 	asm ("mulq %3; 1: divq %4; movl $1,%1; 2:\n"
> > > 	     _ASM_EXTABLE(1b, 2b)
> > > 	     : "=a" (q), "+r" (ok)
> > > 	     : "a" (a), "rm" (mul), "rm" (div)
> > > 	     : "rdx");
> > >
> > > 	return ok ? q : -1ul;
> > > }
> > >
> > > ?
> > >
> > > Should return ULLONG_MAX on #DE.
> > >
> > > Oleg.

^ permalink raw reply	[flat|nested] 23+ messages in thread
end of thread, other threads:[~2025-07-08 11:14 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2025-07-07  8:14 divide error in x86 and cputime Li,Rongqing
2025-07-07 15:11 ` Steven Rostedt
2025-07-07 22:09 ` Oleg Nesterov
2025-07-07 22:20   ` Steven Rostedt
2025-07-07 22:33     ` Steven Rostedt
2025-07-07 23:00       ` Oleg Nesterov
2025-07-08 11:00         ` David Laight
2025-07-08  1:40   ` Reply: [????] " Li,Rongqing
2025-07-08  1:53     ` Steven Rostedt
2025-07-08  1:58       ` Reply: [????] " Li,Rongqing
2025-07-08  2:05         ` Steven Rostedt
2025-07-08  2:17         ` Oleg Nesterov
2025-07-08  9:58       ` David Laight
2025-07-07 22:30 ` Oleg Nesterov
2025-07-07 23:41   ` Reply: [????] " Li,Rongqing
2025-07-07 23:53     ` Steven Rostedt
2025-07-08  0:10       ` Reply: [????] " Li,Rongqing
2025-07-08  0:30         ` Steven Rostedt
2025-07-08  1:17           ` Reply: [????] " Li,Rongqing
2025-07-08  1:41             ` Steven Rostedt
2025-07-08 10:35           ` [????] Re: [????] " David Laight
2025-07-08 11:12             ` Reply: [????] " Li,Rongqing
2025-07-08  0:23   ` Reply: " Li,Rongqing