Date: Thu, 4 Feb 2016 18:32:59 +0000
From: Juri Lelli
To: Steven Rostedt
Cc: Peter Zijlstra, Ingo Molnar, LKML, Clark Williams, John Kacur,
 Daniel Bristot de Oliveira, Juri Lelli
Subject: Re: [BUG] Corrupted SCHED_DEADLINE bandwidth with cpusets
Message-ID: <20160204183259.GF29586@e106622-lin>
References: <20160203135550.5f95ecb2@gandalf.local.home>
 <20160204095448.GE12132@e106622-lin>
 <20160204120412.GA29586@e106622-lin>
 <20160204122745.GC29586@e106622-lin>
 <20160204163049.GE29586@e106622-lin>
 <20160204123103.058642ed@gandalf.local.home>
In-Reply-To: <20160204123103.058642ed@gandalf.local.home>
User-Agent: Mutt/1.5.21 (2010-09-15)

On 04/02/16 12:31, Steven Rostedt wrote:
> On Thu, 4 Feb 2016 16:30:49 +0000
> Juri Lelli wrote:
> 
> > I've actually changed this approach a bit, and things seem better
> > here. Could you please give this a try? (You can also fetch the
> > same branch.)
> 
> It appears to fix the one issue I pointed out, but it doesn't fix the
> issue with cpusets.
> 
> # burn&
> # TASK=$!
> # schedtool -E -t 2000000:20000000 $TASK
> # grep dl /proc/sched_debug
> dl_rq[0]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 104857
> dl_rq[1]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 104857
> dl_rq[2]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 104857
> dl_rq[3]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 104857
> dl_rq[4]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 104857
> dl_rq[5]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 104857
> dl_rq[6]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 104857
> dl_rq[7]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 104857
> 
> # mkdir /sys/fs/cgroup/cpuset/my_cpuset
> # echo 1 > /sys/fs/cgroup/cpuset/my_cpuset/cpuset.cpus
> # grep dl /proc/sched_debug
> dl_rq[0]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 209714
> dl_rq[1]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 209714
> dl_rq[2]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 209714
> dl_rq[3]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 209714
> dl_rq[4]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 209714
> dl_rq[5]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 209714
> dl_rq[6]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 209714
> dl_rq[7]:
>   .dl_nr_running   : 0
>   .dl_bw->bw       : 996147
>   .dl_bw->total_bw : 209714
> 
> It appears to add double the bandwidth.
> 

Mmm.. IIUC, that's because we don't destroy any root_domain in this
case, since sched_load_balance of the parent is still set; so the
task's bandwidth gets added to the existing root_domain again. I could
fix that with some flag indicating when we actually destroy
root_domain(s), but I fear it would make this solution even uglier
than it already is :/. More thinking required.

Thanks for testing.

Best,

- Juri
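
P.S.: FWIW, the numbers in your transcript are consistent with plain
double accounting. Assuming the usual to_ratio() scaling by 2^BW_SHIFT
(with BW_SHIFT == 20, so 2^20 == 1048576):

  996147 ~= 0.95 * 1048576  (default global limit, sched_rt_runtime_us
                             / sched_rt_period_us = 950000 / 1000000)
  104857 ~= 0.10 * 1048576  (the task's 2000000 / 20000000 bandwidth)
  209714  = 2 * 104857      (the same task accounted twice)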
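
Also, in case it's useful for poking at the destroy path: IIRC
root_domains only get torn down and rebuilt once load balancing is
switched off at the root level, so something like the sequence below
(a rough, untested sketch; same v1 cpuset mount point as in your
transcript) should leave my_cpuset with its own root_domain:

  # create a child cpuset on CPU 1 (mems must be set as well)
  mkdir /sys/fs/cgroup/cpuset/my_cpuset
  echo 1 > /sys/fs/cgroup/cpuset/my_cpuset/cpuset.cpus
  echo 0 > /sys/fs/cgroup/cpuset/my_cpuset/cpuset.mems
  # stop balancing across the whole root cpuset, so the child
  # becomes its own sched domain / root_domain
  echo 0 > /sys/fs/cgroup/cpuset/cpuset.sched_load_balance
  # total_bw for CPU 1 should now be tracked separately
  grep dl /proc/sched_debug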