From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932487AbcEYRkS (ORCPT ); Wed, 25 May 2016 13:40:18 -0400 Received: from foss.arm.com ([217.140.101.70]:36465 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753001AbcEYRkO (ORCPT ); Wed, 25 May 2016 13:40:14 -0400 Subject: Re: [RFC PATCH] sched: fix hierarchical order in rq->leaf_cfs_rq_list To: Vincent Guittot , peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, pjt@google.com References: <1464080135-17112-1-git-send-email-vincent.guittot@linaro.org> <1464083710-4370-1-git-send-email-vincent.guittot@linaro.org> Cc: yuyang.du@intel.com, bsegall@google.com From: Dietmar Eggemann Message-ID: <5745E37B.4080804@arm.com> Date: Wed, 25 May 2016 18:40:11 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0 MIME-Version: 1.0 In-Reply-To: <1464083710-4370-1-git-send-email-vincent.guittot@linaro.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Vincent, On 24/05/16 10:55, Vincent Guittot wrote: > Fix the insertion of cfs_rq in rq->leaf_cfs_rq_list to ensure that > a child will always called before its parent. > > The hierarchical order in shares update list has been introduced by > commit 67e86250f8ea ("sched: Introduce hierarchal order on shares update list") > > With the current implementation a child can be still put after its parent. > > Lets take the example of > root > \ > b > /\ > c d* > | > e* > > with root -> b -> c already enqueued but not d -> e so the leaf_cfs_rq_list > looks like: head -> c -> b -> root -> tail > > The branch d -> e will be added the first time that they are enqueued, > starting with e then d. > > When e is added, its parents is not already on the list so e is put at the > tail : head -> c -> b -> root -> e -> tail > > Then, d is added at the head because its parent is already on the list: > head -> d -> c -> b -> root -> e -> tail > > e is not placed at the right position and will be called the last whereas > it should be called at the beginning. > > Because it follows the bottom-up enqueue sequence, we are sure that we > will finished to add either a cfs_rq without parent or a cfs_rq with a parent > that is already on the list. We can use this event to detect when we have > finished to add a new branch. For the others, whose parents are not already > added, we have to ensure that they will be added after their children that > have just been inserted the steps before, and after any potential parents that > are already in the list. The easiest way is to put the cfs_rq just after the > last inserted one and to keep track of it untl the branch is fully added. > > Signed-off-by: Vincent Guittot So the use case for this would be you create a multi-level task group hierarchy on a cpu (e.g. tg_css_id=2,4,6 on cpu=1) and let a task run in the highest level task group (tg_css_id=6). list_add_leaf_cfs_rq() call: ... enqueue_task_fair: cpu=1 tg_css_id=6 cfs_rq=0xffffffc97500f700 on_list=0 enqueue_task_fair: cpu=1 tg_css_id=4 cfs_rq=0xffffffc975175900 on_list=0 enqueue_task_fair: cpu=1 tg_css_id=2 cfs_rq=0xffffffc975866d00 on_list=0 enqueue_task_fair: cpu=1 tg_css_id=1 cfs_rq=0xffffffc97fec44e8 on_list=1 ... In this case, the for_each_leaf_cfs_rq() in update_blocked_averages() iterates in the tg_css_id=2,1,6,4 order: ... update_blocked_averages: tg_css_id=2 cfs_rq=0xffffffc975866d00 on_list=1 update_blocked_averages: tg_css_id=1 cfs_rq=0xffffffc97fec44e8 on_list=1 update_blocked_averages: tg_css_id=6 cfs_rq=0xffffffc97500f700 on_list=1 update_blocked_averages: tg_css_id=4 cfs_rq=0xffffffc975175900 on_list=1 ... which I guess breaks the update_tg_cfs_util() call you introduced in update_blocked_averages() in '[RFC PATCH v2] sched: reflect sched_entity movement into task_group's utilization'. IMHO, otherwise update_blocked_averages() can deal with the list not being ordered tg_css_id=6,4,2,1. https://lkml.org/lkml/2016/5/24/200 The use of for_each_leaf_cfs_rq() in update_shares() is gone. Do the remaining call sites (update_runtime_enabled(), unthrottle_offline_cfs_rqs() require any ordering? [...]