Date: Sun, 4 Jan 2026 18:15:01 +0000
From: David Laight
To: Oleg Nesterov
Cc: Xia Fukun, Peter Zijlstra, Ingo Molnar, "Li,Rongqing", "linux-kernel@vger.kernel.org", "vschneid@redhat.com", "mgorman@suse.de", "bsegall@google.com", "rostedt@goodmis.org", "dietmar.eggemann@arm.com", "vincent.guittot@linaro.org", "juri.lelli@redhat.com", "Zhangqiao (2012 lab)"
Subject: Re: [????] Re: divide error in x86 and cputime
Message-ID: <20260104181501.740884d5@pumpkin>
References: <78a0d7bb20504c0884d474868eccd858@baidu.com> <20250707220937.GA15787@redhat.com> <20250707223038.GB15787@redhat.com> <2ef88def90634827bac1874d90e0e329@baidu.com> <518a5712-3348-4acd-ae26-529621143ef7@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 4 Jan 2026 15:23:19 +0100
Oleg Nesterov wrote:

> Peter, Ingo,
>
> can you take
>
> 	[PATCH v3 0/2] x86/math64: handle #DE in mul_u64_u64_div_u64()
> 	https://lore.kernel.org/all/20250815164009.GA11676@redhat.com/
>
> ? at least 1/2 which fixes the problem with #DE
...

I need to look at the state of my mul_u64_u64_div_u64() patch as well.
I think that has got lost somewhere.
Partially due to arguments about how to handle overflow and divide by zero.
I don't see a problem returning ~0ull for both - it is extremely unlikely to
be a valid result (esp. for code that doesn't need to handle overflow).

But this code needs a completely different fix.
Either the total runtime needs holding in some other units, or the
calculation needs to use the 'delta runtime' rather than 'absolute runtime'
so that modular arithmetic avoids the overflow.

The extra check before the divide will stop the panic, but the returned value
isn't going to be correct.
After 'not much longer' utime will be large enough that the divide no longer
overflows - at which point the calculated value is complete garbage.

	David

>
> Oleg.
>
> On 01/04, Xia Fukun wrote:
> >
> > On 7/8/2025 7:41 AM, Li,Rongqing wrote:
> > >
> > > it happened when a process with 236 busy polling threads , run about 904 days, the total time will overflow the 64bit
> > >
> > > non-x86 system maybe has same issue, once (stime + utime) overflows 64bit, mul_u64_u64_div_u64 from lib/math/div64.c maybe cause division by 0
> > >
> >
> > We have encountered the same issue in an environment with x86 architecture and kernel version 5.10.
> >
> > [48734536.498953] divide error: 0000 [#1] SMP NOPTI
> > [48734536.504336] CPU: 273 PID: 4619 Comm: nano-sysmonitor Kdump: loaded Tainted: G OE 5.10.0-60.18.0.50.r1209_60_175.hce2.x86_64 #1
> > [48734536.518065] Hardware name: XFUSION 5885H V7/BC15MBHA, BIOS 01.02.01.03 01/01/2024
> > [48734536.526620] RIP: 0010:cputime_adjust+0x55/0xb0
> > [48734536.532093] Code: 0b 48 8b 7d 10 49 89 c0 48 8d 04 0e 48 39 f8 73 38 48 8b 45 00 48 8b 55 08 48 85 c0 74 16 48 85 d2 74 4d 4c 8d 0c 10 48 f7 e7 <49> f7 f1 48 39 c6 48 0f 42 f0 48 89 f8 48 29 f0 48 39 c1 77 29 48
> > [48734536.552057] RSP: 0018:ffffae408e07bbc8 EFLAGS: 00010807
> > [48734536.558328] RAX: 2facb95ea704eb6a RBX: ffff98b6293db180 RCX: fff9b822b886cabf
> > [48734536.566529] RDX: 0005cf0f135b9489 RSI: 0005cf0ec21afa94 RDI: ffff93c922ae82ee
> > [48734536.574727] RBP: ffffae408e07bbf8 R08: 0000000000000082 R09: 000007333e295d49
> > [48734536.582930] R10: 8000000000000000 R11: 0000000000000000 R12: ffffae408e07bcf8
> > [48734536.591131] R13: ffffae408e07bcf0 R14: ffff98b6293db190 R15: fffa2e80e26a98fd
> > [48734536.599334] FS: 00007f0bc58c3740(0000) GS:ffff98bb75040000(0000) knlGS:0000000000000000
> > [48734536.608498] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [48734536.615294] CR2: 0000557c0ddca1c8 CR3: 00000600ae12a002 CR4: 0000000000372ee0
> > [48734536.623497] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [48734536.631697] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
> > [48734536.639898] Call Trace:
> > [48734536.712624]  thread_group_cputime_adjusted+0x4b/0x70
> > [48734536.718634]  do_task_stat+0x2d8/0xdc0
> > [48734536.723326]  task_info_proc_get_info+0x133/0x150
> >
> >
> > Specifically, a division error occurs in cputime_adjust() during the following calculation:
> >
> > mul_u64_u64_div_u64(0x5cf1187f5ad33, 0xffff93c922ae82ee, 0x7333e295d49)
> >
> > Is the patch provided here feasible? Or are there any known workarounds?
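
The operands quoted above can be checked in userspace. This is only a
minimal sketch, assuming a compiler with the unsigned __int128 extension
(GCC or Clang); mul_div_quotient_overflows() is a made-up helper name, not a
kernel function. It reports whether a * b / c fits in 64 bits, which is the
condition the x86 'div' instruction needs in order not to raise #DE:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not a kernel API).
 *
 * Returns true when a * b / c cannot be represented in 64 bits.
 * The x86 128-by-64 'div' raises #DE exactly when the high 64 bits
 * of the dividend are >= the divisor, which also covers c == 0.
 */
static bool mul_div_quotient_overflows(uint64_t a, uint64_t b, uint64_t c)
{
	/* Full 128-bit product, as 'mul' leaves it in RDX:RAX. */
	unsigned __int128 prod = (unsigned __int128)a * b;

	return (uint64_t)(prod >> 64) >= c;
}
```

With the three values from the report the helper returns true: the high half
of the product is roughly the size of the first operand, which is far larger
than the divisor 0x7333e295d49 (per the report, the wrapped stime + utime).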
> >
> > > so to cputime, could cputime_adjust() return stime if stime if stime + utime is overflow
> > >
> > > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > > index 6dab4854..db0c273 100644
> > > --- a/kernel/sched/cputime.c
> > > +++ b/kernel/sched/cputime.c
> > > @@ -579,6 +579,10 @@ void cputime_adjust(struct task_cputime *curr, struct prev_cputime *prev,
> > >  		goto update;
> > >  	}
> > >
> > > +	if (stime > (stime + utime)) {
> > > +		goto update;
> > > +	}
> > > +
> > >  	stime = mul_u64_u64_div_u64(stime, rtime, stime + utime);
> > >  	/*
> > >  	 * Because mul_u64_u64_div_u64() can approximate on some
> > >
> > >
> >
> >
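
To make the two ideas above concrete, here is a rough userspace sketch, not
the kernel implementation and not the contents of any posted patch. It
assumes the GCC/Clang unsigned __int128 extension; the names
sat_mul_u64_u64_div_u64() and runtime_delta() are invented for illustration.
The first function returns ~0ull for both divide by zero and quotient
overflow; the second shows why a delta taken with unsigned subtraction
survives a wrap of the absolute counter:

```c
#include <stdint.h>

/*
 * Illustrative only: compute a * b / c, returning ~0ull when the
 * divisor is zero or the quotient does not fit in 64 bits.
 */
static uint64_t sat_mul_u64_u64_div_u64(uint64_t a, uint64_t b, uint64_t c)
{
	unsigned __int128 prod = (unsigned __int128)a * b;

	/* High half >= divisor means overflow; c == 0 always lands here. */
	if ((uint64_t)(prod >> 64) >= c)
		return ~0ull;

	return (uint64_t)(prod / c);
}

/*
 * Illustrative only: unsigned subtraction is performed modulo 2^64,
 * so the difference is still right after 'now' has wrapped past zero,
 * provided the true delta itself is below 2^64.
 */
static uint64_t runtime_delta(uint64_t now, uint64_t prev)
{
	return now - prev;
}
```

The saturating variant only avoids the trap. With the operands from the
report it returns ~0ull, which is still not a meaningful stime, so the
caller would need the delta-based (or rescaled) calculation to get a
correct value.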