From mboxrd@z Thu Jan 1 00:00:00 1970 From: Petr Mladek Subject: Re: [PATCH] thermal/powerclamp: Prevent division by zero when counting interval Date: Fri, 5 Aug 2016 15:12:42 +0200 Message-ID: <20160805131242.GA10099@pathway.suse.cz> References: <1470322606-12435-1-git-send-email-pmladek@suse.com> <20160804103200.486fb236@jacob-builder> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Received: from mx2.suse.de ([195.135.220.15]:46796 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S964964AbcHENMp (ORCPT ); Fri, 5 Aug 2016 09:12:45 -0400 Content-Disposition: inline In-Reply-To: <20160804103200.486fb236@jacob-builder> Sender: linux-pm-owner@vger.kernel.org List-Id: linux-pm@vger.kernel.org To: Jacob Pan Cc: Zhang Rui , Eduardo Valentin , linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org On Thu 2016-08-04 10:32:00, Jacob Pan wrote: > On Thu, 4 Aug 2016 16:56:46 +0200 > Petr Mladek wrote: > > > I have got a zero division error when disabling the forced > > idle injection from the intel powerclamp. I did > > > > echo 0 >/sys/class/thermal/cooling_device48/cur_state > > > > and got > > > > [ 986.072632] divide error: 0000 [#1] PREEMPT SMP > > [ 986.078989] Modules linked in: > > [ 986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted > > 4.7.0-1-default+ #3055 [ 986.093781] Hardware name: Intel > > Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 > > 05/15/2013 [ 986.106227] task: ffff880430e1c080 task.stack: > > ffff880427ef0000 [ 986.114122] RIP: 0010:[] > > [] clamp_thread+0x1d9/0x600 [ 986.124609] RSP: > > 0018:ffff880427ef3e20 EFLAGS: 00010246 [ 986.131860] RAX: > > 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001 > > [ 986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: > > 0000000000000018 [ 986.150478] RBP: ffff880427ef3ec8 R08: > > ffff880427ef0000 R09: 0000000000000002 [ 986.159779] R10: > > 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002 > > [ 986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15: > > ffff880427ef0000 [ 986.178388] FS: 0000000000000000(0000) > > GS:ffff880435940000(0000) knlGS:0000000000000000 [ 986.188785] CS: > > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 986.196559] CR2: > > 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0 > > [ 986.205909] Stack: [ 986.209524] ffff8802be897b00 > > ffff880430e1c080 0000000000000011 0000006a35959780 [ 986.219236] > > 0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0 > > [ 986.228966] 0000000100029d93 ffffffff81794140 0000000000000000 > > ffffffff05000011 [ 986.238686] Call Trace: [ 986.242825] > > [] ? pkg_state_counter+0x80/0x80 [ 986.250866] > > [] ? powerclamp_set_cur_state+0x180/0x180 > > [ 986.259797] [] kthread+0xc9/0xe0 > > [ 986.266682] [] ret_from_fork+0x1f/0x40 > > [ 986.274142] [] ? > > kthread_create_on_node+0x180/0x180 [ 986.282869] Code: d1 ea 48 89 > > d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42 > > 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 f6 48 8b 15 9e > > 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8 [ 986.307806] RIP > > [] clamp_thread+0x1d9/0x600 [ 986.315871] RSP > > > > > > RIP points to the following lines: > > > > compensation = get_compensation(target_ratio); > > interval = duration_jiffies*100/(target_ratio+compensation); > > > > A solution would be to switch the following two commands in > > powerclamp_set_cur_state(): > > > > set_target_ratio = 0; > > end_power_clamp(); > > > I see, there is race condition, clamping threads should be stopped if > target ratio is 0. > > But I think that the zero division might happen also when target_ratio > > is non-zero because the compensation might be negative. Therefore > > it is better to check the sum of target_ratio and compensation > > explicitly. > > > compensation should never be negative. since it is the additional idle > ratio added on top of requested ratio. I am not sure if you are talking about the desired behavior or the current code. get_compensation() returns value computed from steady_comp values. These values are assigned in adjust_compensation() and the code seems to store even negative values. But I did not tried to investigate it much deeper. > If actual idle is more than requested, we will skip injection period. > So i prefer to have both changes. OK, I'll send an updated patch. Best Regards, Petr