From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758032AbYFLIuB (ORCPT ); Thu, 12 Jun 2008 04:50:01 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754716AbYFLItx (ORCPT ); Thu, 12 Jun 2008 04:49:53 -0400 Received: from bombadil.infradead.org ([18.85.46.34]:52336 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754300AbYFLItw (ORCPT ); Thu, 12 Jun 2008 04:49:52 -0400 Subject: Re: [PATCH 1/2] sched: fair group: fix overflow(was: fix divide by zero) From: Peter Zijlstra To: Lai Jiangshan Cc: mingo@elte.hu, Linux Kernel Mailing List In-Reply-To: <4850E192.3080804@cn.fujitsu.com> References: <484F7AC5.8040206@cn.fujitsu.com> <1213168547.31518.66.camel@twins> <4850E192.3080804@cn.fujitsu.com> Content-Type: text/plain Date: Thu, 12 Jun 2008 10:49:44 +0200 Message-Id: <1213260584.31518.86.camel@twins> Mime-Version: 1.0 X-Mailer: Evolution 2.22.1.1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 2008-06-12 at 16:42 +0800, Lai Jiangshan wrote: > Peter Zijlstra wrote: > > > > I think the same thing to do is limit the shares value to something > > smaller instead of using an even more expensive divide. > > > > yes, you are right! > > I found another bug about "the shares value is too large": > > pid1 and pid2 are set affinity to cpu#0 > pid1 is attached to cg1 and pid2 is attached to cg2 > > if cg1/cpu.shares = 1024 cg2/cpu.shares = 2000000000 > then pid2 got 100% usage of cpu, and pid1 0% > > if cg1/cpu.shares = 1024 cg2/cpu.shares = 20000000000 > then pid2 got 0% usage of cpu, and pid1 100% > > > And a weight of a cfs_rq is the sum of weights of which entities > are queued on this cfs_rq, so the shares value should be limited > to a smaller value. Yeah, a lot of stuff will fall apart when weights grow too large, I think somewhere the load balance code also assumes weight * NICE_0_LOAD always fits in a long. So that immediately limits us to 32-10=22 bits (on 32bit systems). There might be other funnies.. > I think that (1UL << 18) is a good limited value: > 1)it's not too large, we can create a lot of group before overflow > 2)it's several times the weight value for nice=-19 (not too small) Thanks! Acked-by: Peter Zijlstra > Signed-off-by: Lai Jiangshan > --- > diff --git a/kernel/sched.c b/kernel/sched.c > index bfb8ad8..fe1b6c7 100644 > --- a/kernel/sched.c > +++ b/kernel/sched.c > @@ -312,12 +312,15 @@ static DEFINE_SPINLOCK(task_group_lock); > #endif > > /* > - * A weight of 0, 1 or ULONG_MAX can cause arithmetics problems. > + * A weight of 0 or 1 can cause arithmetics problems. > + * A weight of a cfs_rq is the sum of weights of which entities > + * are queued on this cfs_rq, so a weight of a entity should not be > + * too large, so as the shares value of a task group. > * (The default weight is 1024 - so there's no practical > * limitation from this.) > */ > #define MIN_SHARES 2 > -#define MAX_SHARES (ULONG_MAX - 1) > +#define MAX_SHARES (1UL << 18) > > static int init_task_group_load = INIT_TASK_GROUP_LOAD; > #endif > > > >