From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757196Ab0JLKQ5 (ORCPT ); Tue, 12 Oct 2010 06:16:57 -0400
Received: from mx2.mail.elte.hu ([157.181.151.9]:45643 "EHLO mx2.mail.elte.hu"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750867Ab0JLKQ4 (ORCPT ); Tue, 12 Oct 2010 06:16:56 -0400
Date: Tue, 12 Oct 2010 12:16:45 +0200
From: Ingo Molnar
To: Con Kolivas
Cc: William Pitcock, linux-kernel@vger.kernel.org, peterz@infradead.org,
	efault@gmx.de
Subject: Re: [PATCH try 5] CFS: Add hierarchical tree-based penalty.
Message-ID: <20101012101645.GA32486@elte.hu>
References: <20101012093044.GD20366@elte.hu>
	<8358526.1721286876359420.JavaMail.root@ifrit.dereferenced.org>
	<20101012094735.GH20366@elte.hu>
	<201010122057.37272.kernel@kolivas.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201010122057.37272.kernel@kolivas.org>
User-Agent: Mutt/1.5.20 (2009-08-17)
X-ELTE-SpamScore: -2.0
X-ELTE-SpamLevel:
X-ELTE-SpamCheck: no
X-ELTE-SpamVersion: ELTE 2.0
X-ELTE-SpamCheck-Details: score=-2.0 required=5.9 tests=BAYES_00 autolearn=no
	SpamAssassin version=3.2.5
	-2.0 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
	[score: 0.0000]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Con Kolivas wrote:

> On Tue, 12 Oct 2010 20:47:35 Ingo Molnar wrote:
> > * William Pitcock wrote:
> > > Hi,
> > >
> > > ----- "Ingo Molnar" wrote:
> > > > * William Pitcock wrote:
> > > > > Inspired by the recent change to BFS by Con Kolivas, this patch
> > > > > causes vruntime to be penalized based on parent depth from their
> > > > > root task group.
> > > > >
> > > > > I have, for the moment, decided to make it a default feature since
> > > > > the design of CFS ensures that broken applications depending on task
> > > > > enqueue behaviour behaving traditionally will continue to work.
> > > >
> > > > Just curious, is this v5 submission a reply to Peter's earlier review
> > > > of your v3 patch? If yes then please explicitly outline the changes
> > > > you did so that Peter and others do not have to guess about the
> > > > direction your work is taking.
> > >
> > > I just did that in the email I just sent. Simply put, I was talking
> > > with Con a few weeks ago about the concept of having a maximum amount
> > > of service for all threads belonging to a process. This did not work
> > > out so well, so Con proposed penalizing based on fork depth, which
> > > still allows us to maintain interactivity with make -j64 running in
> > > the background.
> > >
> > > Actually, I lie: it works great for server scenarios where you have
> > > some sysadmin also running azureus. Azureus gets penalized instead,
> > > but other apps like audacious get penalized too.
> >
> > Thanks for the explanation!
> >
> > 	Ingo
>
> It's a fun feature I've been playing with that was going to make it into the
> next -ck, albeit disabled by default. Here's what the patch changelog was
> going to say:

Find below the reply Peter sent to William's v5 patch. I suspect there
will be a v6 to address those problems :)

(William: please Cc: Con too to future updates of your patch.)

Thanks,

	Ingo

----- Forwarded message from Peter Zijlstra -----

Date: Tue, 12 Oct 2010 11:46:57 +0200
From: Peter Zijlstra
To: William Pitcock
Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Mike Galbraith
Subject: Re: [PATCH try 3] CFS: Add hierarchical tree-based penalty.

On Tue, 2010-10-12 at 13:34 +0400, William Pitcock wrote:
> Yes, this should be a multiplication I believe, not a divide.
> My original code had this as a multiplication, not a division, as does
> the new patch.
>
> However, I think:
>
>    vruntime >>= tsk->fork_depth;
>
> would do the job just as well and be faster.

That's still somewhat iffy as explained, vruntime is the absolute
service level, multiplying that by 2 (or even more) will utterly upset
things.

Imagine two runnable tasks of weight 1, say both have a vruntime of 3
million seconds (there being two, vruntime will advance at 1/2
wall-time).

Now, suppose you wake a third, it too had a vruntime of around 3 million
seconds (it only slept for a little while), if you then multiply that
with 2 and place it at 6 mil, it will have to wait for 6 mil seconds
before it gets serviced (twice the time of the 3 mil difference in
service time between this new and the old tasks).

So, theory says the fair thing to do is place new tasks at the weighted
average of the existing tasks, but computing that is expensive, so what
we do is place it somewhere near the leftmost task in the tree.

Now, you don't want to push it out too far to the right, otherwise we
get starvation issues and people get upset.

So you have to somehow determine a window in which you want to place
this task and then vary in that depending on your fork_depth.

Simply manipulating the absolute service levels like you propose isn't
going to work.

----- End forwarded message -----

-- 
Thanks,

	Ingo
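
The arithmetic in Peter's forwarded reply can be sketched as a toy model in C. This is not kernel code and not the CFS placement logic: the helper names, the MAX_FORK_DEPTH cap, and the linear scaling inside the window are all assumptions made for illustration, and every runnable task is assumed to have equal weight and to sit at the leftmost vruntime.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap on the fork depth we distinguish (an assumption). */
#define MAX_FORK_DEPTH 8

/*
 * Wall-clock time a newly placed task waits before the existing tasks
 * catch up to it. Assumes nr_running equal-weight tasks all sitting at
 * min_vruntime, so vruntime advances at 1/nr_running of wall time.
 */
static uint64_t wait_wall_time(uint64_t min_vruntime, uint64_t placed,
                               unsigned int nr_running)
{
    if (placed <= min_vruntime)
        return 0;
    return (placed - min_vruntime) * nr_running;
}

/* Scaling the absolute service level: the approach the reply argues against. */
static uint64_t place_scaled(uint64_t vruntime, unsigned int fork_depth)
{
    assert(fork_depth < 64); /* shifting by >= 64 bits is undefined */
    return vruntime << fork_depth;
}

/*
 * Bounded alternative: place the task inside [min_vruntime,
 * min_vruntime + window], further right for deeper forks.
 */
static uint64_t place_windowed(uint64_t min_vruntime, uint64_t window,
                               unsigned int fork_depth)
{
    if (fork_depth > MAX_FORK_DEPTH)
        fork_depth = MAX_FORK_DEPTH;
    return min_vruntime + window * fork_depth / MAX_FORK_DEPTH;
}
```

With the numbers from the reply (two runnable tasks at 3 million, a third doubled to 6 million), wait_wall_time(3000000, 6000000, 2) comes out at 6 million, matching the wait Peter describes. With place_windowed() the wait can never exceed window * nr_running, whatever the absolute vruntime happens to be.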