public inbox for linux-kernel@vger.kernel.org
* [PATCH] sched: fair-group: fix a Div0 error of the fair group scheduler
@ 2008-04-28  4:54 Miao Xie
  2008-04-28  5:45 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Miao Xie @ 2008-04-28  4:54 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra; +Cc: Linux-Kernel

When I echoed 0 into the "cpu.shares" file, a divide-by-zero (Div0) error
occurred.

We found it is caused by the following call chain:

   sched_group_set_shares(tg, shares)
       set_se_shares(tg->se[i], shares/nr_cpu_ids)
           __set_se_shares(se, shares)
               div64_64((1ULL<<32), shares)

When the echoed value was less than the number of processors, the result of the
expression "shares/nr_cpu_ids" was 0, and when div64_64() then divided by that
result, the Div0 error occurred.

I think it is unnecessary to divide the shares value by nr_cpu_ids, because
__update_group_shares_cpu() and init_tg_cfs_entry() do not divide the shares
value by nr_cpu_ids when setting the shares of the sched entity.

This patch fixes this bug. Echoing ULONG_MAX into cpu.shares also causes a
Div0 error, so we add a macro MAX_SHARES to limit the maximum value of
shares.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

---
  kernel/sched.c |   17 +++++++++++------
  1 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 740fb40..aa1bb81 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -318,7 +318,13 @@ static DEFINE_MUTEX(doms_cur_mutex);
  # define INIT_TASK_GROUP_LOAD	NICE_0_LOAD
  #endif

+/*
+ * A weight of 0, 1 or ULONG_MAX can cause arithmetics problems.
+ * (The default weight is 1024 - so there's no practical
+ *  limitation from this.)
+ */
  #define MIN_SHARES	2
+#define MAX_SHARES	(ULONG_MAX - 1)

  static int init_task_group_load = INIT_TASK_GROUP_LOAD;
  #endif
@@ -1748,6 +1754,8 @@ __update_group_shares_cpu(struct task_group *tg, struct sched_domain *sd,

  	if (shares < MIN_SHARES)
  		shares = MIN_SHARES;
+	else if (shares > MAX_SHARES)
+		shares = MAX_SHARES;

  	__set_se_shares(tg->se[tcpu], shares);
  }
@@ -8722,13 +8730,10 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares)
  	if (!tg->se[0])
  		return -EINVAL;

-	/*
-	 * A weight of 0 or 1 can cause arithmetics problems.
-	 * (The default weight is 1024 - so there's no practical
-	 *  limitation from this.)
-	 */
  	if (shares < MIN_SHARES)
  		shares = MIN_SHARES;
+	else if (shares > MAX_SHARES)
+		shares = MAX_SHARES;

  	mutex_lock(&shares_mutex);
  	if (tg->shares == shares)
@@ -8753,7 +8758,7 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares)
  		 * force a rebalance
  		 */
  		cfs_rq_set_shares(tg->cfs_rq[i], 0);
-		set_se_shares(tg->se[i], shares/nr_cpu_ids);
+		set_se_shares(tg->se[i], shares);
  	}

  	/*
-- 
1.5.4.rc3




* Re: [PATCH] sched: fair-group: fix a Div0 error of the fair group scheduler
  2008-04-28  4:54 [PATCH] sched: fair-group: fix a Div0 error of the fair group scheduler Miao Xie
@ 2008-04-28  5:45 ` Peter Zijlstra
  2008-04-28  8:27   ` Miao Xie
  2008-04-28  8:34 ` Peter Zijlstra
  2008-04-28 12:51 ` Ingo Molnar
  2 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2008-04-28  5:45 UTC (permalink / raw)
  To: miaox; +Cc: Ingo Molnar, Linux-Kernel

On Mon, 2008-04-28 at 12:54 +0800, Miao Xie wrote:
> When I echoed 0 into the "cpu.shares" file, a divide-by-zero (Div0) error
> occurred.
>
> We found it is caused by the following call chain:
>
>    sched_group_set_shares(tg, shares)
>        set_se_shares(tg->se[i], shares/nr_cpu_ids)
>            __set_se_shares(se, shares)
>                div64_64((1ULL<<32), shares)
>
> When the echoed value was less than the number of processors, the result of the
> expression "shares/nr_cpu_ids" was 0, and when div64_64() then divided by that
> result, the Div0 error occurred.
>
> I think it is unnecessary to divide the shares value by nr_cpu_ids, because
> __update_group_shares_cpu() and init_tg_cfs_entry() do not divide the shares
> value by nr_cpu_ids when setting the shares of the sched entity.
>
> This patch fixes this bug. Echoing ULONG_MAX into cpu.shares also causes a
> Div0 error, so we add a macro MAX_SHARES to limit the maximum value of
> shares.

how about:

diff --git a/kernel/sched.c b/kernel/sched.c
index 740fb40..b68127a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -8025,7 +8025,7 @@ static void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
 
 	se->my_q = cfs_rq;
 	se->load.weight = tg->shares;
-	se->load.inv_weight = div64_64(1ULL<<32, se->load.weight);
+	se->load.inv_weight = 0;
 	se->parent = parent;
 }
 #endif
@@ -8692,7 +8692,7 @@ static void __set_se_shares(struct sched_entity *se, unsigned long shares)
 		dequeue_entity(cfs_rq, se, 0);
 
 	se->load.weight = shares;
-	se->load.inv_weight = div64_64((1ULL<<32), shares);
+	se->load.inv_weight = 0;
 
 	if (on_rq)
 		enqueue_entity(cfs_rq, se, 0);




* Re: [PATCH] sched: fair-group: fix a Div0 error of the fair group scheduler
  2008-04-28  5:45 ` Peter Zijlstra
@ 2008-04-28  8:27   ` Miao Xie
  2008-04-28  8:33     ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Miao Xie @ 2008-04-28  8:27 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Linux-Kernel

On 2008-04-28 13:45, Peter Zijlstra wrote:
> On Mon, 2008-04-28 at 12:54 +0800, Miao Xie wrote:
>> When I echoed 0 into the "cpu.shares" file, a divide-by-zero (Div0) error
>> occurred.
>>
>> We found it is caused by the following call chain:
>>
>>    sched_group_set_shares(tg, shares)
>>        set_se_shares(tg->se[i], shares/nr_cpu_ids)
>>            __set_se_shares(se, shares)
>>                div64_64((1ULL<<32), shares)
>>
>> When the echoed value was less than the number of processors, the result of the
>> expression "shares/nr_cpu_ids" was 0, and when div64_64() then divided by that
>> result, the Div0 error occurred.
>>
>> I think it is unnecessary to divide the shares value by nr_cpu_ids, because
>> __update_group_shares_cpu() and init_tg_cfs_entry() do not divide the shares
>> value by nr_cpu_ids when setting the shares of the sched entity.
>>
>> This patch fixes this bug. Echoing ULONG_MAX into cpu.shares also causes a
>> Div0 error, so we add a macro MAX_SHARES to limit the maximum value of
>> shares.
> 
> how about:
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 740fb40..b68127a 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -8025,7 +8025,7 @@ static void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
>  
>  	se->my_q = cfs_rq;
>  	se->load.weight = tg->shares;
> -	se->load.inv_weight = div64_64(1ULL<<32, se->load.weight);
> +	se->load.inv_weight = 0;
>  	se->parent = parent;
>  }
>  #endif
> @@ -8692,7 +8692,7 @@ static void __set_se_shares(struct sched_entity *se, unsigned long shares)
>  		dequeue_entity(cfs_rq, se, 0);
>  
>  	se->load.weight = shares;
> -	se->load.inv_weight = div64_64((1ULL<<32), shares);
> +	se->load.inv_weight = 0;
>  
>  	if (on_rq)
>  		enqueue_entity(cfs_rq, se, 0);
> 

I'm sorry, I didn't explain this clearly. Although echoing ULONG_MAX into
cpu.shares causes a Div0 error, that error is not caused by the above code.
It is caused by the following code:

    calc_delta_mine()
        ->lw->inv_weight = (WMULT_CONST-lw->weight/2) / (lw->weight+1);

Besides, the Div0 error caused by echoing ULONG_MAX occurred on a UP system
or on an SMP system with only one cpu.

So this patch fixes the bug caused by echoing a number less than the number
of processors into the "cpu.shares" file, but doesn't fix the bug caused by
echoing ULONG_MAX.



* Re: [PATCH] sched: fair-group: fix a Div0 error of the fair group scheduler
  2008-04-28  8:27   ` Miao Xie
@ 2008-04-28  8:33     ` Peter Zijlstra
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2008-04-28  8:33 UTC (permalink / raw)
  To: miaox; +Cc: Ingo Molnar, Linux-Kernel

On Mon, 2008-04-28 at 16:27 +0800, Miao Xie wrote:

> I'm sorry, I didn't explain this clearly. Although echoing ULONG_MAX into
> cpu.shares causes a Div0 error, that error is not caused by the above code.
> It is caused by the following code:
>
>     calc_delta_mine()
>         ->lw->inv_weight = (WMULT_CONST-lw->weight/2) / (lw->weight+1);
>
> Besides, the Div0 error caused by echoing ULONG_MAX occurred on a UP system
> or on an SMP system with only one cpu.
>
> So this patch fixes the bug caused by echoing a number less than the number
> of processors into the "cpu.shares" file, but doesn't fix the bug caused by
> echoing ULONG_MAX.

Ah, ok. I guess we run into other problems with weights that heavy,
but we should not crash.

Ok, I'll ACK your initial patch.



* Re: [PATCH] sched: fair-group: fix a Div0 error of the fair group scheduler
  2008-04-28  4:54 [PATCH] sched: fair-group: fix a Div0 error of the fair group scheduler Miao Xie
  2008-04-28  5:45 ` Peter Zijlstra
@ 2008-04-28  8:34 ` Peter Zijlstra
  2008-04-28 12:51 ` Ingo Molnar
  2 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2008-04-28  8:34 UTC (permalink / raw)
  To: miaox; +Cc: Ingo Molnar, Linux-Kernel

On Mon, 2008-04-28 at 12:54 +0800, Miao Xie wrote:
> When I echoed 0 into the "cpu.shares" file, a divide-by-zero (Div0) error
> occurred.
>
> We found it is caused by the following call chain:
>
>    sched_group_set_shares(tg, shares)
>        set_se_shares(tg->se[i], shares/nr_cpu_ids)
>            __set_se_shares(se, shares)
>                div64_64((1ULL<<32), shares)
>
> When the echoed value was less than the number of processors, the result of the
> expression "shares/nr_cpu_ids" was 0, and when div64_64() then divided by that
> result, the Div0 error occurred.
>
> I think it is unnecessary to divide the shares value by nr_cpu_ids, because
> __update_group_shares_cpu() and init_tg_cfs_entry() do not divide the shares
> value by nr_cpu_ids when setting the shares of the sched entity.
>
> This patch fixes this bug. Echoing ULONG_MAX into cpu.shares also causes a
> Div0 error, so we add a macro MAX_SHARES to limit the maximum value of
> shares.
> 
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

> ---
>   kernel/sched.c |   17 +++++++++++------
>   1 files changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 740fb40..aa1bb81 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -318,7 +318,13 @@ static DEFINE_MUTEX(doms_cur_mutex);
>   # define INIT_TASK_GROUP_LOAD	NICE_0_LOAD
>   #endif
> 
> +/*
> + * A weight of 0, 1 or ULONG_MAX can cause arithmetics problems.
> + * (The default weight is 1024 - so there's no practical
> + *  limitation from this.)
> + */
>   #define MIN_SHARES	2
> +#define MAX_SHARES	(ULONG_MAX - 1)
> 
>   static int init_task_group_load = INIT_TASK_GROUP_LOAD;
>   #endif
> @@ -1748,6 +1754,8 @@ __update_group_shares_cpu(struct task_group *tg, struct sched_domain *sd,
> 
>   	if (shares < MIN_SHARES)
>   		shares = MIN_SHARES;
> +	else if (shares > MAX_SHARES)
> +		shares = MAX_SHARES;
> 
>   	__set_se_shares(tg->se[tcpu], shares);
>   }
> @@ -8722,13 +8730,10 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares)
>   	if (!tg->se[0])
>   		return -EINVAL;
> 
> -	/*
> -	 * A weight of 0 or 1 can cause arithmetics problems.
> -	 * (The default weight is 1024 - so there's no practical
> -	 *  limitation from this.)
> -	 */
>   	if (shares < MIN_SHARES)
>   		shares = MIN_SHARES;
> +	else if (shares > MAX_SHARES)
> +		shares = MAX_SHARES;
> 
>   	mutex_lock(&shares_mutex);
>   	if (tg->shares == shares)
> @@ -8753,7 +8758,7 @@ int sched_group_set_shares(struct task_group *tg, unsigned long shares)
>   		 * force a rebalance
>   		 */
>   		cfs_rq_set_shares(tg->cfs_rq[i], 0);
> -		set_se_shares(tg->se[i], shares/nr_cpu_ids);
> +		set_se_shares(tg->se[i], shares);
>   	}
> 
>   	/*



* Re: [PATCH] sched: fair-group: fix a Div0 error of the fair group scheduler
  2008-04-28  4:54 [PATCH] sched: fair-group: fix a Div0 error of the fair group scheduler Miao Xie
  2008-04-28  5:45 ` Peter Zijlstra
  2008-04-28  8:34 ` Peter Zijlstra
@ 2008-04-28 12:51 ` Ingo Molnar
  2 siblings, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2008-04-28 12:51 UTC (permalink / raw)
  To: Miao Xie; +Cc: Peter Zijlstra, Linux-Kernel


* Miao Xie <miaox@cn.fujitsu.com> wrote:

> When I echoed 0 into the "cpu.shares" file, a divide-by-zero (Div0)
> error occurred.
>
> We found it is caused by the following call chain:
>
>   sched_group_set_shares(tg, shares)
>       set_se_shares(tg->se[i], shares/nr_cpu_ids)
>           __set_se_shares(se, shares)
>               div64_64((1ULL<<32), shares)
>
> When the echoed value was less than the number of processors, the
> result of the expression "shares/nr_cpu_ids" was 0, and when
> div64_64() then divided by that result, the Div0 error occurred.
>
> I think it is unnecessary to divide the shares value by nr_cpu_ids,
> because __update_group_shares_cpu() and init_tg_cfs_entry() do not
> divide the shares value by nr_cpu_ids when setting the shares of the
> sched entity.
>
> This patch fixes this bug. Echoing ULONG_MAX into cpu.shares also
> causes a Div0 error, so we add a macro MAX_SHARES to limit the
> maximum value of shares.

thanks, applied.

	Ingo

