From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933119Ab2LHMUn (ORCPT ); Sat, 8 Dec 2012 07:20:43 -0500
Received: from mga01.intel.com ([192.55.52.88]:26619 "EHLO mga01.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S933103Ab2LHMUm (ORCPT ); Sat, 8 Dec 2012 07:20:42 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.84,243,1355126400"; d="scan'208";a="259051022"
Message-ID: <50C33095.9030702@intel.com>
Date: Sat, 08 Dec 2012 20:20:37 +0800
From: Alex Shi
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120912 Thunderbird/15.0.1
MIME-Version: 1.0
To: Paul Turner
CC: Alex Shi, Ingo Molnar, Peter Zijlstra, lkml, Vincent Guittot,
	Preeti U Murthy, Andrew Morton, Venkatesh Pallipadi, Tejun Heo
Subject: Re: weakness of runnable load tracking?
References: <50C00D41.1010800@intel.com> <50C0B579.3040602@intel.com>
In-Reply-To: <50C0B579.3040602@intel.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/06/2012 11:10 PM, Alex Shi wrote:
>
>>> Hi Paul & Ingo:
>>>
>>> In short: burst forking/waking tasks have no time to accumulate
>>> their load contribution, so their runnable load is taken as zero.
>>> That makes select_task_rq make a wrong decision about which group
>>> is idlest.
>>
>> So these aren't strictly comparable; bursting and forking tasks have
>> fairly different characteristics here.
>
> Many thanks for looking into this. :)
>>
>> When we fork a task we intentionally reset the previous history. This
>> means that a forked task that immediately runs is going to show up as
>> 100% runnable and then converge to its true value. This was fairly
>> intentionally chosen so that tasks would "start" fast rather than
>> having to worry about ramp up.
>
> I am sorry, but I didn't see the 100% runnable for a newly forked task.
> I believe the code needs the following patch to initialize decay_count
> and load_avg_contrib; otherwise they hold random values.
> In enqueue_entity_load_avg(), p->se.avg.runnable_avg_sum for a newly
> forked task is always zero, either because se.avg.last_runnable_update
> is set to clock_task due to decay_count <= 0, or because only
> __synchronize_entity_decay is called, not update_entity_load_avg.

Paul:
Would you like to comment on the following patches?

>
> ===========
> From a161000dbece6e95bf3b81e9246d51784589d393 Mon Sep 17 00:00:00 2001
> From: Alex Shi
> Date: Mon, 3 Dec 2012 17:30:39 +0800
> Subject: [PATCH 05/12] sched: load tracking bug fix
>
> We need to initialize se.avg.{decay_count, load_avg_contrib} to zero
> after a new task is forked.
> Otherwise, random values in these fields give incorrect statistics
> when the new task is enqueued:
> enqueue_task_fair
>     enqueue_entity
>         enqueue_entity_load_avg
>
> Signed-off-by: Alex Shi
> ---
>  kernel/sched/core.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 5dae0d2..e6533e1 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1534,6 +1534,8 @@ static void __sched_fork(struct task_struct *p)
>  #if defined(CONFIG_SMP) && defined(CONFIG_FAIR_GROUP_SCHED)
>  	p->se.avg.runnable_avg_period = 0;
>  	p->se.avg.runnable_avg_sum = 0;
> +	p->se.avg.decay_count = 0;
> +	p->se.avg.load_avg_contrib = 0;
>  #endif
>  #ifdef CONFIG_SCHEDSTATS
>  	memset(&p->se.statistics, 0, sizeof(p->se.statistics));
>

--
Thanks
    Alex
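
For anyone following the enqueue path, here is a rough user-space sketch of
why the two extra fields matter. It is not kernel code: the toy_* names, the
struct layouts, and the parent values (900 and so on) are all made up for
illustration. It assumes the "random" values are simply whatever the child
inherited when its task structure was copied, and it models enqueue as doing
nothing more than adding load_avg_contrib to the queue's runnable load,
leaving out decay and the decay_count branches entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for the load-tracking fields named in the patch. */
struct toy_avg {
	uint32_t runnable_avg_sum;
	uint32_t runnable_avg_period;
	int64_t decay_count;
	unsigned long load_avg_contrib;
};

/* Toy run queue: only the aggregate that enqueue adds to. */
struct toy_rq {
	unsigned long runnable_load_avg;
};

/* Assumption: the child starts as a byte copy of the parent's fields. */
static void toy_copy_from_parent(struct toy_avg *child,
				 const struct toy_avg *parent)
{
	memcpy(child, parent, sizeof(*child));
}

/* Fork-time reset; 'patched' selects whether the two extra fields are cleared. */
static void toy_sched_fork(struct toy_avg *avg, int patched)
{
	avg->runnable_avg_period = 0;
	avg->runnable_avg_sum = 0;
	if (patched) {
		avg->decay_count = 0;
		avg->load_avg_contrib = 0;
	}
}

/* Simplified enqueue: whatever load_avg_contrib holds is added to the queue. */
static void toy_enqueue(struct toy_rq *rq, const struct toy_avg *avg)
{
	assert(rq != NULL && avg != NULL);
	rq->runnable_load_avg += avg->load_avg_contrib;
}

/* Returns the queue load after enqueueing one freshly forked child. */
static unsigned long toy_load_after_fork(int patched)
{
	/* Hypothetical parent state; the numbers carry no meaning. */
	struct toy_avg parent = { 40000, 47000, 3, 900 };
	struct toy_avg child;
	struct toy_rq rq = { 0 };

	toy_copy_from_parent(&child, &parent);
	toy_sched_fork(&child, patched);
	toy_enqueue(&rq, &child);
	return rq.runnable_load_avg;
}
```

In this model, toy_load_after_fork(0) leaves the stale 900 on the queue,
while toy_load_after_fork(1) leaves 0, which is the behaviour the patch is
after.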