Subject: Re: [PATCH v3 2/7] sched: accumulate per-cfs_rq cpu usage
From: Peter Zijlstra
To: Paul Turner
Cc: bharata@linux.vnet.ibm.com, linux-kernel@vger.kernel.org, Dhaval Giani,
 Balbir Singh, Vaidyanathan Srinivasan, Srivatsa Vaddagiri, Kamalesh Babulal,
 Ingo Molnar, Pavel Emelyanov, Herbert Poetzl, Avi Kivity, Chris Friesen,
 Paul Menage, Mike Waychison, Nikhil Rao
Date: Thu, 14 Oct 2010 11:27:08 +0200
Message-ID: <1287048428.29097.174.camel@twins>
References: <20101012074910.GA9893@in.ibm.com> <20101012075109.GC9893@in.ibm.com> <1287046900.29097.161.camel@twins>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2010-10-14 at 02:14 -0700, Paul Turner wrote:
> On Thu, Oct 14, 2010 at 2:01 AM, Peter Zijlstra wrote:
> > On Tue, 2010-10-12 at 13:21 +0530, Bharata B Rao wrote:
> >> +static void account_cfs_rq_quota(struct cfs_rq *cfs_rq,
> >> +                                unsigned long delta_exec)
> >> +{
> >> +       if (cfs_rq->quota_assigned == RUNTIME_INF)
> >> +               return;
> >> +
> >> +       cfs_rq->quota_used += delta_exec;
> >> +
> >> +       if (cfs_rq->quota_used < cfs_rq->quota_assigned)
> >> +               return;
> >> +
> >> +       cfs_rq->quota_assigned += tg_request_cfs_quota(cfs_rq->tg);
> >> +}
> >
> > That looks iffy, quota_assigned is only ever incremented and can wrap.
>
> This can't advance at a rate faster than ~vruntime and we can't handle
> wrapping there anyway (fortunately it would take something like 35k
> years?)

You can't go faster than wall-time; vruntime can actually go a lot
faster, and it can deal with wrapping.

> > Why not subtract delta_exec and replenish when < 0? That keeps the
> > numbers small.
>
> Accounting in the opposite direction allows us to catch up in
> subsequent periods when a task exceeds its bandwidth across an
> interval where we are not able to immediately throttle it (e.g. a
> costly syscall without CONFIG_PREEMPT). Since we'll continue to accrue
> the execution time in this case, it will be effectively pre-charged
> against the next slice received.

Hmm, how so? That's simply a matter of the quota going negative, right?