From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756124Ab1KWOs1 (ORCPT ); Wed, 23 Nov 2011 09:48:27 -0500 Received: from casper.infradead.org ([85.118.1.10]:54772 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752447Ab1KWOs0 convert rfc822-to-8bit (ORCPT ); Wed, 23 Nov 2011 09:48:26 -0500 Message-ID: <1322059696.14799.68.camel@twins> Subject: Re: [patch 3/7] sched: set skip_clock_update in yield_task_fair() From: Peter Zijlstra To: Mike Galbraith Cc: Suresh Siddha , linux-kernel , Ingo Molnar , Paul Turner Date: Wed, 23 Nov 2011 15:48:16 +0100 In-Reply-To: <1321971686.6855.18.camel@marge.simson.net> References: <1321350377.1421.55.camel@twins> <1321406062.16760.60.camel@sbsiddha-desk.sc.intel.com> <1321435455.5072.64.camel@marge.simson.net> <1321468646.11680.2.camel@sbsiddha-desk.sc.intel.com> <1321495153.5100.7.camel@marge.simson.net> <1321544313.6308.25.camel@marge.simson.net> <1321545376.2495.1.camel@laptop> <1321547917.6308.48.camel@marge.simson.net> <1321551381.15339.21.camel@sbsiddha-desk.sc.intel.com> <1321629267.7080.13.camel@marge.simson.net> <1321629441.7080.15.camel@marge.simson.net> <1321630552.2197.15.camel@twins> <1321971445.6855.14.camel@marge.simson.net> <1321971686.6855.18.camel@marge.simson.net> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8BIT X-Mailer: Evolution 3.2.1- Mime-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 2011-11-22 at 15:21 +0100, Mike Galbraith wrote: > This is another case where we are on our way to schedule(), > so can save a useless clock update and resulting microscopic > vruntime update. > Everytime I see one of these skip_clock_update thingies I get the idea we should do something smarter, because its most fragile and getting it wrong results in a subtle trainwreck.. Would something like the below help in validating this stuff? --- Index: linux-2.6/kernel/sched/core.c =================================================================== --- linux-2.6.orig/kernel/sched/core.c +++ linux-2.6/kernel/sched/core.c @@ -113,12 +113,27 @@ void update_rq_clock(struct rq *rq) { s64 delta; - if (rq->skip_clock_update > 0) + if (rq->skip_clock_update > 0) { + /* + * We skipped an update even though we had a full context + * switch in between, badness. + */ + if (rq->clock_update_stamp != rq->sched_switch) + trace_printk("lost an update!!\n"); return; + } + + /* + * We're updating the clock even though we didn't switch yet. + */ + if (rq->clock_update_stamp == rq->sched_switch) + trace_printk("too many updates!!\n"); delta = sched_clock_cpu(cpu_of(rq)) - rq->clock; rq->clock += delta; update_rq_clock_task(rq, delta); + + rq->clock_update_stamp = rq->sched_switch } /* @@ -1922,6 +1937,9 @@ static void finish_task_switch(struct rq #ifdef __ARCH_WANT_INTERRUPTS_ON_CTXSW local_irq_enable(); #endif /* __ARCH_WANT_INTERRUPTS_ON_CTXSW */ +#ifdef CONFIG_SCHED_DEBUG + schedstat_inc(rq, sched_switch); +#endif finish_lock_switch(rq, prev); fire_sched_in_preempt_notifiers(current); Index: linux-2.6/kernel/sched/sched.h =================================================================== --- linux-2.6.orig/kernel/sched/sched.h +++ linux-2.6/kernel/sched/sched.h @@ -374,6 +374,8 @@ struct rq { unsigned char nohz_balance_kick; #endif int skip_clock_update; + unsigned int clock_update_stamp; + /* capture load from *all* tasks on this cpu: */ struct load_weight load;