Date: Sun, 4 Jan 2026 15:23:19 +0100
From: Oleg Nesterov
To: Xia Fukun, Peter Zijlstra, Ingo Molnar
Cc: "Li,Rongqing", David Laight, "linux-kernel@vger.kernel.org", "vschneid@redhat.com", "mgorman@suse.de", "bsegall@google.com", "rostedt@goodmis.org", "dietmar.eggemann@arm.com", "vincent.guittot@linaro.org", "juri.lelli@redhat.com", "Zhangqiao (2012 lab)"
Subject: Re: 答复: [????]
 Re: divide error in x86 and cputime
References: <78a0d7bb20504c0884d474868eccd858@baidu.com> <20250707220937.GA15787@redhat.com> <20250707223038.GB15787@redhat.com> <2ef88def90634827bac1874d90e0e329@baidu.com> <518a5712-3348-4acd-ae26-529621143ef7@huawei.com>
In-Reply-To: <518a5712-3348-4acd-ae26-529621143ef7@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Peter, Ingo, can you take

	[PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64()
	https://lore.kernel.org/all/20250815164009.GA11676@redhat.com/

? at least 1/2 which fixes the problem with #DE ...

Oleg.

On 01/04, Xia Fukun wrote:
>
> On 7/8/2025 7:41 AM, Li,Rongqing wrote:
> >
> > it happened when a process with 236 busy polling threads , run about 904 days, the total time will overflow the 64bit
> >
> > non-x86 system maybe has same issue, once (stime + utime) overflows 64bit, mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division by 0
> >
>
> We have encountered the same issue in an environment with x86 architecture and kernel version 5.10.
>
> [48734536.498953] divide error: 0000 [#1] SMP NOPTI
> [48734536.504336] CPU: 273 PID: 4619 Comm: nano-sysmonitor Kdump: loaded Tainted: G OE 5.10.0-60.18.0.50.r1209_60_175.hce2.x86_64 #1
> [48734536.518065] Hardware name: XFUSION 5885H V7/BC15MBHA, BIOS 01.02.01.03 01/01/2024
> [48734536.526620] RIP: 0010:cputime_adjust+0x55/0xb0
> [48734536.532093] Code: 0b 48 8b 7d 10 49 89 c0 48 8d 04 0e 48 39 f8 73 38 48 8b 45 00 48 8b 55 08 48 85 c0 74 16 48 85 d2 74 4d 4c 8d 0c 10 48 f7 e7 <49> f7 f1 48 39 c6 48 0f 42 f0 48 89 f8 48 29 f0 48 39 c1 77 29 48
> [48734536.552057] RSP: 0018:ffffae408e07bbc8 EFLAGS: 00010807
> [48734536.558328] RAX: 2facb95ea704eb6a RBX: ffff98b6293db180 RCX: fff9b822b886cabf
> [48734536.566529] RDX: 0005cf0f135b9489 RSI: 0005cf0ec21afa94 RDI: ffff93c922ae82ee
> [48734536.574727] RBP: ffffae408e07bbf8 R08: 0000000000000082 R09: 000007333e295d49
> [48734536.582930] R10: 8000000000000000 R11: 0000000000000000 R12: ffffae408e07bcf8
> [48734536.591131] R13: ffffae408e07bcf0 R14: ffff98b6293db190 R15: fffa2e80e26a98fd
> [48734536.599334] FS: 00007f0bc58c3740(0000) GS:ffff98bb75040000(0000) knlGS:0000000000000000
> [48734536.608498] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [48734536.615294] CR2: 0000557c0ddca1c8 CR3: 00000600ae12a002 CR4: 0000000000372ee0
> [48734536.623497] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [48734536.631697] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
> [48734536.639898] Call Trace:
> [48734536.712624] thread_group_cputime_adjusted+0x4b/0x70
> [48734536.718634] do_task_stat+0x2d8/0xdc0
> [48734536.723326] task_info_proc_get_info+0x133/0x150
>
>
> Specifically, a division error occurs in cputime_adjust() during the following calculation:
>
> mul_u64_u64_div_u64(0x5cf1187f5ad33, 0xffff93c922ae82ee, 0x7333e295d49)
>
> Is the patch provided here feasible? Or are there any known workarounds?
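
For illustration only, here is a minimal userspace sketch of why the operands quoted above trap. It assumes the x86 path is a 64x64 multiply followed by a 128-by-64 "div" (the Code: bytes around the faulting instruction look like mul %rdi; div %r9, but that decoding is an assumption, not something stated in this thread). The helper name mul_div_would_fault() is hypothetical, and unsigned __int128 is a GCC/Clang extension. This is not the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model, not kernel code: x86-64 "div" divides the 128-bit
 * value RDX:RAX by a 64-bit operand and raises #DE when the divisor is
 * zero or when the quotient does not fit in 64 bits. The quotient fits
 * only if the high half of the dividend is smaller than the divisor.
 */
static bool mul_div_would_fault(uint64_t a, uint64_t b, uint64_t c)
{
	/* Full 128-bit product of a * b (GCC/Clang extension type). */
	unsigned __int128 prod = (unsigned __int128)a * b;
	uint64_t hi = (uint64_t)(prod >> 64);

	return c == 0 || hi >= c;
}
```

Called as mul_div_would_fault(0x5cf1187f5ad33ULL, 0xffff93c922ae82eeULL, 0x7333e295d49ULL), the helper returns true: the high half of the product is far above the wrapped divisor. That is consistent with the dump, where RDX (0005cf0f135b9489) is larger than R9 (000007333e295d49), if those registers do hold the high half and the divisor at the time of the fault.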
>
> > so to cputime, could cputime_adjust() return stime if stime if stime + utime is overflow
> >
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > index 6dab4854..db0c273 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
> >  		goto update;
> >  	}
> >
> > +	if (stime > (stime + utime)) {
> > +		goto update;
> > +	}
> > +
> >  	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> >  	/*
> >  	 * Because mul_u64_u64_div_u64() can approximate on some
> >
> >
>
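
The check in the quoted diff relies on unsigned wraparound: if stime + utime exceeds 64 bits, the truncated sum ends up smaller than stime. A small userspace sketch of that test, next to the compiler builtin that reports the same condition, might look like the following. Both helper names are hypothetical, and __builtin_add_overflow() is a GCC/Clang builtin, so treat this as a model of the idea rather than a proposed kernel change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the comparison used in the quoted diff. */
static bool cputime_sum_wraps_cmp(uint64_t stime, uint64_t utime)
{
	/* Unsigned addition wraps modulo 2^64, so a wrapped sum is < stime. */
	return stime > stime + utime;
}

/* Hypothetical helper: the same condition via the GCC/Clang builtin. */
static bool cputime_sum_wraps(uint64_t stime, uint64_t utime)
{
	uint64_t sum;

	/* Returns true when the mathematical sum does not fit in sum. */
	return __builtin_add_overflow(stime, utime, &sum);
}
```

The two helpers agree for every input, including utime == 0, where neither reports a wrap. Note that this only detects the wrapped divisor; it says nothing about a quotient that overflows 64 bits with a valid, non-wrapped divisor.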