linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] thermal/powerclamp: Prevent division by zero when counting interval
@ 2016-08-04 14:56 Petr Mladek
  2016-08-04 17:32 ` Jacob Pan
  0 siblings, 1 reply; 5+ messages in thread
From: Petr Mladek @ 2016-08-04 14:56 UTC (permalink / raw)
  To: Zhang Rui, Eduardo Valentin, Jacob Pan
  Cc: linux-pm, linux-kernel, Petr Mladek

I have got a zero division error when disabling the forced
idle injection from the intel powerclamp. I did

  echo 0 >/sys/class/thermal/cooling_device48/cur_state

and got

[  986.072632] divide error: 0000 [#1] PREEMPT SMP
[  986.078989] Modules linked in:
[  986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted 4.7.0-1-default+ #3055
[  986.093781] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 05/15/2013
[  986.106227] task: ffff880430e1c080 task.stack: ffff880427ef0000
[  986.114122] RIP: 0010:[<ffffffff81794859>]  [<ffffffff81794859>] clamp_thread+0x1d9/0x600
[  986.124609] RSP: 0018:ffff880427ef3e20  EFLAGS: 00010246
[  986.131860] RAX: 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001
[  986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000018
[  986.150478] RBP: ffff880427ef3ec8 R08: ffff880427ef0000 R09: 0000000000000002
[  986.159779] R10: 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002
[  986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15: ffff880427ef0000
[  986.178388] FS:  0000000000000000(0000) GS:ffff880435940000(0000) knlGS:0000000000000000
[  986.188785] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  986.196559] CR2: 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0
[  986.205909] Stack:
[  986.209524]  ffff8802be897b00 ffff880430e1c080 0000000000000011 0000006a35959780
[  986.219236]  0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0
[  986.228966]  0000000100029d93 ffffffff81794140 0000000000000000 ffffffff05000011
[  986.238686] Call Trace:
[  986.242825]  [<ffffffff81794140>] ? pkg_state_counter+0x80/0x80
[  986.250866]  [<ffffffff81794680>] ? powerclamp_set_cur_state+0x180/0x180
[  986.259797]  [<ffffffff8111d1a9>] kthread+0xc9/0xe0
[  986.266682]  [<ffffffff8193d69f>] ret_from_fork+0x1f/0x40
[  986.274142]  [<ffffffff8111d0e0>] ? kthread_create_on_node+0x180/0x180
[  986.282869] Code: d1 ea 48 89 d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 <f7> f6 48 8b 15 9e 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8
[  986.307806] RIP  [<ffffffff81794859>] clamp_thread+0x1d9/0x600
[  986.315871]  RSP <ffff880427ef3e20>

RIP points to the following lines:

	compensation = get_compensation(target_ratio);
	interval = duration_jiffies*100/(target_ratio+compensation);

A solution would be to switch the following two commands in
powerclamp_set_cur_state():

	set_target_ratio = 0;
	end_power_clamp();

But I think that the zero division might happen also when target_ratio
is non-zero because the compensation might be negative. Therefore
it is better to check the sum of target_ratio and compensation
explicitly.

Also the compensated_ratio variable is always set. Therefore there
is no need to initialize it.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 drivers/thermal/intel_powerclamp.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index 015ce2eb6eb7..65b865184990 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -388,7 +388,7 @@ static int clamp_thread(void *arg)
 		int sleeptime;
 		unsigned long target_jiffies;
 		unsigned int guard;
-		unsigned int compensation = 0;
+		unsigned int compensated_ratio;
 		int interval; /* jiffies to sleep for each attempt */
 		unsigned int duration_jiffies = msecs_to_jiffies(duration);
 		unsigned int window_size_now;
@@ -409,8 +409,11 @@ static int clamp_thread(void *arg)
 		 * c-states, thus we need to compensate the injected idle ratio
 		 * to achieve the actual target reported by the HW.
 		 */
-		compensation = get_compensation(target_ratio);
-		interval = duration_jiffies*100/(target_ratio+compensation);
+		compensated_ratio = target_ratio +
+			get_compensation(target_ratio);
+		if (compensated_ratio <= 0)
+			compensated_ratio = 1;
+		interval = duration_jiffies * 100 / compensated_ratio;
 
 		/* align idle time */
 		target_jiffies = roundup(jiffies, interval);
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] thermal/powerclamp: Prevent division by zero when counting interval
  2016-08-04 14:56 [PATCH] thermal/powerclamp: Prevent division by zero when counting interval Petr Mladek
@ 2016-08-04 17:32 ` Jacob Pan
  2016-08-05 13:12   ` Petr Mladek
  0 siblings, 1 reply; 5+ messages in thread
From: Jacob Pan @ 2016-08-04 17:32 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Zhang Rui, Eduardo Valentin, linux-pm, linux-kernel,
	jacob.jun.pan

On Thu,  4 Aug 2016 16:56:46 +0200
Petr Mladek <pmladek@suse.com> wrote:

> I have got a zero division error when disabling the forced
> idle injection from the intel powerclamp. I did
> 
>   echo 0 >/sys/class/thermal/cooling_device48/cur_state
> 
> and got
> 
> [  986.072632] divide error: 0000 [#1] PREEMPT SMP
> [  986.078989] Modules linked in:
> [  986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted
> 4.7.0-1-default+ #3055 [  986.093781] Hardware name: Intel
> Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734
> 05/15/2013 [  986.106227] task: ffff880430e1c080 task.stack:
> ffff880427ef0000 [  986.114122] RIP: 0010:[<ffffffff81794859>]
> [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [  986.124609] RSP:
> 0018:ffff880427ef3e20  EFLAGS: 00010246 [  986.131860] RAX:
> 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001
> [  986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> 0000000000000018 [  986.150478] RBP: ffff880427ef3ec8 R08:
> ffff880427ef0000 R09: 0000000000000002 [  986.159779] R10:
> 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002
> [  986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15:
> ffff880427ef0000 [  986.178388] FS:  0000000000000000(0000)
> GS:ffff880435940000(0000) knlGS:0000000000000000 [  986.188785] CS:
> 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [  986.196559] CR2:
> 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0
> [  986.205909] Stack: [  986.209524]  ffff8802be897b00
> ffff880430e1c080 0000000000000011 0000006a35959780 [  986.219236]
> 0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0
> [  986.228966]  0000000100029d93 ffffffff81794140 0000000000000000
> ffffffff05000011 [  986.238686] Call Trace: [  986.242825]
> [<ffffffff81794140>] ? pkg_state_counter+0x80/0x80 [  986.250866]
> [<ffffffff81794680>] ? powerclamp_set_cur_state+0x180/0x180
> [  986.259797]  [<ffffffff8111d1a9>] kthread+0xc9/0xe0
> [  986.266682]  [<ffffffff8193d69f>] ret_from_fork+0x1f/0x40
> [  986.274142]  [<ffffffff8111d0e0>] ?
> kthread_create_on_node+0x180/0x180 [  986.282869] Code: d1 ea 48 89
> d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42
> 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 <f7> f6 48 8b 15 9e
> 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8 [  986.307806] RIP
> [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [  986.315871]  RSP
> <ffff880427ef3e20>
> 
> RIP points to the following lines:
> 
> 	compensation = get_compensation(target_ratio);
> 	interval = duration_jiffies*100/(target_ratio+compensation);
> 
> A solution would be to switch the following two commands in
> powerclamp_set_cur_state():
> 
> 	set_target_ratio = 0;
> 	end_power_clamp();
> 
I see, there is race condition, clamping threads should be stopped if
target ratio is 0.
> But I think that the zero division might happen also when target_ratio
> is non-zero because the compensation might be negative. Therefore
> it is better to check the sum of target_ratio and compensation
> explicitly.
> 
compensation should never be negative. since it is the additional idle
ratio added on top of requested ratio. If actual idle is more than
requested, we will skip injection period.
So i prefer to have both changes.
> Also the compensated_ratio variable is always set. Therefore there
> is no need to initialize it.
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
>  drivers/thermal/intel_powerclamp.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/thermal/intel_powerclamp.c
> b/drivers/thermal/intel_powerclamp.c index 015ce2eb6eb7..65b865184990
> 100644 --- a/drivers/thermal/intel_powerclamp.c
> +++ b/drivers/thermal/intel_powerclamp.c
> @@ -388,7 +388,7 @@ static int clamp_thread(void *arg)
>  		int sleeptime;
>  		unsigned long target_jiffies;
>  		unsigned int guard;
> -		unsigned int compensation = 0;
> +		unsigned int compensated_ratio;
>  		int interval; /* jiffies to sleep for each attempt */
>  		unsigned int duration_jiffies =
> msecs_to_jiffies(duration); unsigned int window_size_now;
> @@ -409,8 +409,11 @@ static int clamp_thread(void *arg)
>  		 * c-states, thus we need to compensate the injected
> idle ratio
>  		 * to achieve the actual target reported by the HW.
>  		 */
> -		compensation = get_compensation(target_ratio);
> -		interval =
> duration_jiffies*100/(target_ratio+compensation);
> +		compensated_ratio = target_ratio +
> +			get_compensation(target_ratio);
> +		if (compensated_ratio <= 0)
> +			compensated_ratio = 1;
> +		interval = duration_jiffies * 100 /
> compensated_ratio; 
>  		/* align idle time */
>  		target_jiffies = roundup(jiffies, interval);

[Jacob Pan]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] thermal/powerclamp: Prevent division by zero when counting interval
  2016-08-04 17:32 ` Jacob Pan
@ 2016-08-05 13:12   ` Petr Mladek
  2016-08-05 13:20     ` [PATCH v2] " Petr Mladek
  0 siblings, 1 reply; 5+ messages in thread
From: Petr Mladek @ 2016-08-05 13:12 UTC (permalink / raw)
  To: Jacob Pan; +Cc: Zhang Rui, Eduardo Valentin, linux-pm, linux-kernel

On Thu 2016-08-04 10:32:00, Jacob Pan wrote:
> On Thu,  4 Aug 2016 16:56:46 +0200
> Petr Mladek <pmladek@suse.com> wrote:
> 
> > I have got a zero division error when disabling the forced
> > idle injection from the intel powerclamp. I did
> > 
> >   echo 0 >/sys/class/thermal/cooling_device48/cur_state
> > 
> > and got
> > 
> > [  986.072632] divide error: 0000 [#1] PREEMPT SMP
> > [  986.078989] Modules linked in:
> > [  986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted
> > 4.7.0-1-default+ #3055 [  986.093781] Hardware name: Intel
> > Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734
> > 05/15/2013 [  986.106227] task: ffff880430e1c080 task.stack:
> > ffff880427ef0000 [  986.114122] RIP: 0010:[<ffffffff81794859>]
> > [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [  986.124609] RSP:
> > 0018:ffff880427ef3e20  EFLAGS: 00010246 [  986.131860] RAX:
> > 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001
> > [  986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> > 0000000000000018 [  986.150478] RBP: ffff880427ef3ec8 R08:
> > ffff880427ef0000 R09: 0000000000000002 [  986.159779] R10:
> > 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002
> > [  986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15:
> > ffff880427ef0000 [  986.178388] FS:  0000000000000000(0000)
> > GS:ffff880435940000(0000) knlGS:0000000000000000 [  986.188785] CS:
> > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [  986.196559] CR2:
> > 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0
> > [  986.205909] Stack: [  986.209524]  ffff8802be897b00
> > ffff880430e1c080 0000000000000011 0000006a35959780 [  986.219236]
> > 0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0
> > [  986.228966]  0000000100029d93 ffffffff81794140 0000000000000000
> > ffffffff05000011 [  986.238686] Call Trace: [  986.242825]
> > [<ffffffff81794140>] ? pkg_state_counter+0x80/0x80 [  986.250866]
> > [<ffffffff81794680>] ? powerclamp_set_cur_state+0x180/0x180
> > [  986.259797]  [<ffffffff8111d1a9>] kthread+0xc9/0xe0
> > [  986.266682]  [<ffffffff8193d69f>] ret_from_fork+0x1f/0x40
> > [  986.274142]  [<ffffffff8111d0e0>] ?
> > kthread_create_on_node+0x180/0x180 [  986.282869] Code: d1 ea 48 89
> > d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42
> > 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 <f7> f6 48 8b 15 9e
> > 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8 [  986.307806] RIP
> > [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [  986.315871]  RSP
> > <ffff880427ef3e20>
> > 
> > RIP points to the following lines:
> > 
> > 	compensation = get_compensation(target_ratio);
> > 	interval = duration_jiffies*100/(target_ratio+compensation);
> > 
> > A solution would be to switch the following two commands in
> > powerclamp_set_cur_state():
> > 
> > 	set_target_ratio = 0;
> > 	end_power_clamp();
> > 
> I see, there is race condition, clamping threads should be stopped if
> target ratio is 0.
> > But I think that the zero division might happen also when target_ratio
> > is non-zero because the compensation might be negative. Therefore
> > it is better to check the sum of target_ratio and compensation
> > explicitly.
> > 
> compensation should never be negative. since it is the additional idle
> ratio added on top of requested ratio.

I am not sure if you are talking about the desired behavior or the
current code. get_compensation() returns value computed from
steady_comp values. These values are assigned in adjust_compensation()
and the code seems to store even negative values. But I did not
tried to investigate it much deeper.

> If actual idle is more than requested, we will skip injection period.
> So i prefer to have both changes.

OK, I'll send an updated patch.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2] thermal/powerclamp: Prevent division by zero when counting interval
  2016-08-05 13:12   ` Petr Mladek
@ 2016-08-05 13:20     ` Petr Mladek
  2016-08-05 15:42       ` Jacob Pan
  0 siblings, 1 reply; 5+ messages in thread
From: Petr Mladek @ 2016-08-05 13:20 UTC (permalink / raw)
  To: Zhang Rui, Eduardo Valentin, Jacob Pan
  Cc: linux-pm, linux-kernel, Petr Mladek

I have got a zero division error when disabling the forced
idle injection from the intel powerclamp. I did

  echo 0 >/sys/class/thermal/cooling_device48/cur_state

and got

[  986.072632] divide error: 0000 [#1] PREEMPT SMP
[  986.078989] Modules linked in:
[  986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted 4.7.0-1-default+ #3055
[  986.093781] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 05/15/2013
[  986.106227] task: ffff880430e1c080 task.stack: ffff880427ef0000
[  986.114122] RIP: 0010:[<ffffffff81794859>]  [<ffffffff81794859>] clamp_thread+0x1d9/0x600
[  986.124609] RSP: 0018:ffff880427ef3e20  EFLAGS: 00010246
[  986.131860] RAX: 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001
[  986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000018
[  986.150478] RBP: ffff880427ef3ec8 R08: ffff880427ef0000 R09: 0000000000000002
[  986.159779] R10: 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002
[  986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15: ffff880427ef0000
[  986.178388] FS:  0000000000000000(0000) GS:ffff880435940000(0000) knlGS:0000000000000000
[  986.188785] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  986.196559] CR2: 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0
[  986.205909] Stack:
[  986.209524]  ffff8802be897b00 ffff880430e1c080 0000000000000011 0000006a35959780
[  986.219236]  0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0
[  986.228966]  0000000100029d93 ffffffff81794140 0000000000000000 ffffffff05000011
[  986.238686] Call Trace:
[  986.242825]  [<ffffffff81794140>] ? pkg_state_counter+0x80/0x80
[  986.250866]  [<ffffffff81794680>] ? powerclamp_set_cur_state+0x180/0x180
[  986.259797]  [<ffffffff8111d1a9>] kthread+0xc9/0xe0
[  986.266682]  [<ffffffff8193d69f>] ret_from_fork+0x1f/0x40
[  986.274142]  [<ffffffff8111d0e0>] ? kthread_create_on_node+0x180/0x180
[  986.282869] Code: d1 ea 48 89 d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 <f7> f6 48 8b 15 9e 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8
[  986.307806] RIP  [<ffffffff81794859>] clamp_thread+0x1d9/0x600
[  986.315871]  RSP <ffff880427ef3e20>

RIP points to the following lines:

	compensation = get_compensation(target_ratio);
	interval = duration_jiffies*100/(target_ratio+compensation);

A solution would be to switch the following two commands in
powerclamp_set_cur_state():

	set_target_ratio = 0;
	end_power_clamp();

But I think that the zero division might happen also when target_ratio
is non-zero because the compensation might be negative. Therefore
we also check the sum of target_ratio and compensation explicitly.

Also the compensated_ratio variable is always set. Therefore there
is no need to initialize it.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
Changes against v1:

	+ Also set_target_ratio to 0 after the threads are stopped.

 drivers/thermal/intel_powerclamp.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index 015ce2eb6eb7..0e4dc0afcfd2 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -388,7 +388,7 @@ static int clamp_thread(void *arg)
 		int sleeptime;
 		unsigned long target_jiffies;
 		unsigned int guard;
-		unsigned int compensation = 0;
+		unsigned int compensated_ratio;
 		int interval; /* jiffies to sleep for each attempt */
 		unsigned int duration_jiffies = msecs_to_jiffies(duration);
 		unsigned int window_size_now;
@@ -409,8 +409,11 @@ static int clamp_thread(void *arg)
 		 * c-states, thus we need to compensate the injected idle ratio
 		 * to achieve the actual target reported by the HW.
 		 */
-		compensation = get_compensation(target_ratio);
-		interval = duration_jiffies*100/(target_ratio+compensation);
+		compensated_ratio = target_ratio +
+			get_compensation(target_ratio);
+		if (compensated_ratio <= 0)
+			compensated_ratio = 1;
+		interval = duration_jiffies * 100 / compensated_ratio;
 
 		/* align idle time */
 		target_jiffies = roundup(jiffies, interval);
@@ -647,8 +650,8 @@ static int powerclamp_set_cur_state(struct thermal_cooling_device *cdev,
 		goto exit_set;
 	} else	if (set_target_ratio > 0 && new_target_ratio == 0) {
 		pr_info("Stop forced idle injection\n");
-		set_target_ratio = 0;
 		end_power_clamp();
+		set_target_ratio = 0;
 	} else	/* adjust currently running */ {
 		set_target_ratio = new_target_ratio;
 		/* make new set_target_ratio visible to other cpus */
-- 
1.8.5.6


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] thermal/powerclamp: Prevent division by zero when counting interval
  2016-08-05 13:20     ` [PATCH v2] " Petr Mladek
@ 2016-08-05 15:42       ` Jacob Pan
  0 siblings, 0 replies; 5+ messages in thread
From: Jacob Pan @ 2016-08-05 15:42 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Zhang Rui, Eduardo Valentin, linux-pm, linux-kernel,
	jacob.jun.pan

On Fri,  5 Aug 2016 15:20:41 +0200
Petr Mladek <pmladek@suse.com> wrote:

> I have got a zero division error when disabling the forced
> idle injection from the intel powerclamp. I did
> 
>   echo 0 >/sys/class/thermal/cooling_device48/cur_state
> 
> and got
> 
> [  986.072632] divide error: 0000 [#1] PREEMPT SMP
> [  986.078989] Modules linked in:
> [  986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted
> 4.7.0-1-default+ #3055 [  986.093781] Hardware name: Intel
> Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734
> 05/15/2013 [  986.106227] task: ffff880430e1c080 task.stack:
> ffff880427ef0000 [  986.114122] RIP: 0010:[<ffffffff81794859>]
> [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [  986.124609] RSP:
> 0018:ffff880427ef3e20  EFLAGS: 00010246 [  986.131860] RAX:
> 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001
> [  986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> 0000000000000018 [  986.150478] RBP: ffff880427ef3ec8 R08:
> ffff880427ef0000 R09: 0000000000000002 [  986.159779] R10:
> 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002
> [  986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15:
> ffff880427ef0000 [  986.178388] FS:  0000000000000000(0000)
> GS:ffff880435940000(0000) knlGS:0000000000000000 [  986.188785] CS:
> 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [  986.196559] CR2:
> 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0
> [  986.205909] Stack: [  986.209524]  ffff8802be897b00
> ffff880430e1c080 0000000000000011 0000006a35959780 [  986.219236]
> 0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0
> [  986.228966]  0000000100029d93 ffffffff81794140 0000000000000000
> ffffffff05000011 [  986.238686] Call Trace: [  986.242825]
> [<ffffffff81794140>] ? pkg_state_counter+0x80/0x80 [  986.250866]
> [<ffffffff81794680>] ? powerclamp_set_cur_state+0x180/0x180
> [  986.259797]  [<ffffffff8111d1a9>] kthread+0xc9/0xe0
> [  986.266682]  [<ffffffff8193d69f>] ret_from_fork+0x1f/0x40
> [  986.274142]  [<ffffffff8111d0e0>] ?
> kthread_create_on_node+0x180/0x180 [  986.282869] Code: d1 ea 48 89
> d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42
> 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 <f7> f6 48 8b 15 9e
> 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8 [  986.307806] RIP
> [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [  986.315871]  RSP
> <ffff880427ef3e20>
> 
> RIP points to the following lines:
> 
> 	compensation = get_compensation(target_ratio);
> 	interval = duration_jiffies*100/(target_ratio+compensation);
> 
> A solution would be to switch the following two commands in
> powerclamp_set_cur_state():
> 
> 	set_target_ratio = 0;
> 	end_power_clamp();
> 
> But I think that the zero division might happen also when target_ratio
> is non-zero because the compensation might be negative. Therefore
> we also check the sum of target_ratio and compensation explicitly.
> 
> Also the compensated_ratio variable is always set. Therefore there
> is no need to initialize it.
> 
> Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com>

Thanks
> ---
> Changes against v1:
> 
> 	+ Also set_target_ratio to 0 after the threads are stopped.
> 
>  drivers/thermal/intel_powerclamp.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/thermal/intel_powerclamp.c
> b/drivers/thermal/intel_powerclamp.c index 015ce2eb6eb7..0e4dc0afcfd2
> 100644 --- a/drivers/thermal/intel_powerclamp.c
> +++ b/drivers/thermal/intel_powerclamp.c
> @@ -388,7 +388,7 @@ static int clamp_thread(void *arg)
>  		int sleeptime;
>  		unsigned long target_jiffies;
>  		unsigned int guard;
> -		unsigned int compensation = 0;
> +		unsigned int compensated_ratio;
>  		int interval; /* jiffies to sleep for each attempt */
>  		unsigned int duration_jiffies =
> msecs_to_jiffies(duration); unsigned int window_size_now;
> @@ -409,8 +409,11 @@ static int clamp_thread(void *arg)
>  		 * c-states, thus we need to compensate the injected
> idle ratio
>  		 * to achieve the actual target reported by the HW.
>  		 */
> -		compensation = get_compensation(target_ratio);
> -		interval =
> duration_jiffies*100/(target_ratio+compensation);
> +		compensated_ratio = target_ratio +
> +			get_compensation(target_ratio);
> +		if (compensated_ratio <= 0)
> +			compensated_ratio = 1;
> +		interval = duration_jiffies * 100 /
> compensated_ratio; 
>  		/* align idle time */
>  		target_jiffies = roundup(jiffies, interval);
> @@ -647,8 +650,8 @@ static int powerclamp_set_cur_state(struct
> thermal_cooling_device *cdev, goto exit_set;
>  	} else	if (set_target_ratio > 0 && new_target_ratio
> == 0) { pr_info("Stop forced idle injection\n");
> -		set_target_ratio = 0;
>  		end_power_clamp();
> +		set_target_ratio = 0;
>  	} else	/* adjust currently running */ {
>  		set_target_ratio = new_target_ratio;
>  		/* make new set_target_ratio visible to other cpus */

[Jacob Pan]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-08-05 15:41 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-08-04 14:56 [PATCH] thermal/powerclamp: Prevent division by zero when counting interval Petr Mladek
2016-08-04 17:32 ` Jacob Pan
2016-08-05 13:12   ` Petr Mladek
2016-08-05 13:20     ` [PATCH v2] " Petr Mladek
2016-08-05 15:42       ` Jacob Pan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).