From: David Laight <david.laight.linux@gmail.com>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Xia Fukun <xiafukun@huawei.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
"Li,Rongqing" <lirongqing@baidu.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"vschneid@redhat.com" <vschneid@redhat.com>,
"mgorman@suse.de" <mgorman@suse.de>,
"bsegall@google.com" <bsegall@google.com>,
"rostedt@goodmis.org" <rostedt@goodmis.org>,
"dietmar.eggemann@arm.com" <dietmar.eggemann@arm.com>,
"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
"juri.lelli@redhat.com" <juri.lelli@redhat.com>,
"Zhangqiao (2012 lab)" <zhangqiao22@huawei.com>
Subject: Re: [????] Re: divide error in x86 and cputime
Date: Sun, 4 Jan 2026 18:15:01 +0000
Message-ID: <20260104181501.740884d5@pumpkin>
In-Reply-To: <aVp315EH_asNbC56@redhat.com>
On Sun, 4 Jan 2026 15:23:19 +0100
Oleg Nesterov <oleg@redhat.com> wrote:
> Peter, Ingo,
>
> can you take
>
> [PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64()
> https://lore.kernel.org/all/20250815164009.GA11676@redhat.com/
>
> ? at least 1/2 which fixes the problem with #DE ...
I need to look at the state of my mul_u64_u64_div_u64() patch as well.
I think that has got lost somewhere.
Partially due to arguments about how to handle overflow and divide by zero.
I don't see a problem returning ~0ull for both - it is extremely unlikely
to be a valid result (esp. for code that doesn't need to handle overflow).
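To illustrate the "return ~0ull for both" behaviour, here is a minimal sketch in portable C. It is not the kernel implementation: mul_div_saturating() is a hypothetical name, and the code assumes a compiler that provides unsigned __int128 (GCC or Clang on a 64-bit target).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's mul_u64_u64_div_u64():
 * computes a * b / d through a 128-bit intermediate and returns
 * ~0ull both for d == 0 and for a quotient that needs more than
 * 64 bits.
 */
static uint64_t mul_div_saturating(uint64_t a, uint64_t b, uint64_t d)
{
	unsigned __int128 prod = (unsigned __int128)a * b;

	/* Divide by zero: report the saturated value instead of trapping. */
	if (d == 0)
		return ~0ull;

	/*
	 * The quotient fits in 64 bits only if the high half of the
	 * product is strictly less than the divisor.
	 */
	if ((uint64_t)(prod >> 64) >= d)
		return ~0ull;

	return (uint64_t)(prod / d);
}
```

With the operands from the report quoted below, the high half of the product is far larger than the divisor, so this sketch would return ~0ull where the x86 divide instruction raises #DE.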
But this code needs a completely different fix.
Either the total runtime needs holding in some other units, or the calculation
needs to use the 'delta runtime' rather than 'absolute runtime' so that
modular arithmetic avoids the overflow.
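A rough sketch of the 'delta' idea, under stated assumptions: struct sample and delta_stime_share() are hypothetical names, not kernel code, and the caller is assumed to keep the previous sample around. Unsigned subtraction is modulo 2^64, so the deltas stay correct even when an absolute counter has wrapped between two samples.

```c
#include <stdint.h>

/* Hypothetical snapshot of the counters; not a kernel structure. */
struct sample {
	uint64_t stime;
	uint64_t utime;
	uint64_t rtime;
};

/*
 * Share of the runtime accumulated since 'prev' that is attributed
 * to system time. Each delta is computed modulo 2^64, so it is
 * correct across a wrap of the absolute counter as long as the true
 * delta is below 2^64.
 */
static uint64_t delta_stime_share(const struct sample *prev,
				  const struct sample *cur)
{
	uint64_t ds = cur->stime - prev->stime;
	uint64_t du = cur->utime - prev->utime;
	uint64_t dr = cur->rtime - prev->rtime;
	uint64_t dsum = ds + du;

	/*
	 * Nothing sampled, or the deltas themselves overflowed: this
	 * sketch makes no attempt to handle that and attributes nothing.
	 */
	if (dsum == 0 || dsum < ds)
		return 0;

	/* ds <= dsum, so the quotient is at most dr and cannot overflow. */
	return (uint64_t)(((unsigned __int128)ds * dr) / dsum);
}
```

Because the numerator delta never exceeds the denominator delta, the divide cannot overflow here, whatever the absolute counters have reached.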
The extra check before the divide will stop the panic, but the returned value
isn't going to be correct.
After 'not much longer' utime will be large enough that the divide no
longer overflows - at which point the calculated value is complete garbage.
David
>
> Oleg.
>
> On 01/04, Xia Fukun wrote:
> >
> > On 7/8/2025 7:41 AM, Li,Rongqing wrote:
> > >
> > > > it happened when a process with 236 busy-polling threads ran for about 904 days; the total time overflows 64 bits
> > >
> > > > non-x86 systems may have the same issue: once (stime + utime) overflows 64 bits, mul_u64_u64_div_u64() from lib/math/div64.c may divide by 0
> > >
> >
> > We have encountered the same issue in an environment with x86 architecture and kernel version 5.10.
> >
> > [48734536.498953] divide error: 0000 [#1] SMP NOPTI
> > [48734536.504336] CPU: 273 PID: 4619 Comm: nano-sysmonitor Kdump: loaded Tainted: G OE 5.10.0-60.18.0.50.r1209_60_175.hce2.x86_64 #1
> > [48734536.518065] Hardware name: XFUSION 5885H V7/BC15MBHA, BIOS 01.02.01.03 01/01/2024
> > [48734536.526620] RIP: 0010:cputime_adjust+0x55/0xb0
> > [48734536.532093] Code: 0b 48 8b 7d 10 49 89 c0 48 8d 04 0e 48 39 f8 73 38 48 8b 45 00 48 8b 55 08 48 85 c0 74 16 48 85 d2 74 4d 4c 8d 0c 10 48 f7 e7 <49> f7 f1 48 39 c6 48 0f 42 f0 48 89 f8 48 29 f0 48 39 c1 77 29 48
> > [48734536.552057] RSP: 0018:ffffae408e07bbc8 EFLAGS: 00010807
> > [48734536.558328] RAX: 2facb95ea704eb6a RBX: ffff98b6293db180 RCX: fff9b822b886cabf
> > [48734536.566529] RDX: 0005cf0f135b9489 RSI: 0005cf0ec21afa94 RDI: ffff93c922ae82ee
> > [48734536.574727] RBP: ffffae408e07bbf8 R08: 0000000000000082 R09: 000007333e295d49
> > [48734536.582930] R10: 8000000000000000 R11: 0000000000000000 R12: ffffae408e07bcf8
> > [48734536.591131] R13: ffffae408e07bcf0 R14: ffff98b6293db190 R15: fffa2e80e26a98fd
> > [48734536.599334] FS: 00007f0bc58c3740(0000) GS:ffff98bb75040000(0000) knlGS:0000000000000000
> > [48734536.608498] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [48734536.615294] CR2: 0000557c0ddca1c8 CR3: 00000600ae12a002 CR4: 0000000000372ee0
> > [48734536.623497] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [48734536.631697] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
> > [48734536.639898] Call Trace:
> > [48734536.712624] thread_group_cputime_adjusted+0x4b/0x70
> > [48734536.718634] do_task_stat+0x2d8/0xdc0
> > [48734536.723326] task_info_proc_get_info+0x133/0x150
> >
> >
> > Specifically, a division error occurs in cputime_adjust() during the following calculation:
> >
> > mul_u64_u64_div_u64(0x5cf1187f5ad33, 0xffff93c922ae82ee, 0x7333e295d49)
> >
> > Is the patch provided here feasible? Or are there any known workarounds?
> >
> > > > so for cputime, could cputime_adjust() return stime if stime + utime overflows?
> > >
> > > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > > index 6dab4854..db0c273 100644
> > > --- a/kernel/sched/cputime.c
> > > +++ b/kernel/sched/cputime.c
> > > @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
> > > goto update;
> > > }
> > >
> > > + if (stime > (stime + utime)) {
> > > + goto update;
> > > + }
> > > +
> > > stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > > /*
> > > * Because mul_u64_u64_div_u64() can approximate on some
> > >
> >
> >
> >
>