From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752720AbcBKNFy (ORCPT ); Thu, 11 Feb 2016 08:05:54 -0500
Received: from mail-wm0-f45.google.com ([74.125.82.45]:37208 "EHLO
	mail-wm0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751099AbcBKNFw (ORCPT );
	Thu, 11 Feb 2016 08:05:52 -0500
Date: Thu, 11 Feb 2016 14:05:45 +0100
From: luca abeni
To: Juri Lelli
Cc: Steven Rostedt , linux-kernel@vger.kernel.org, peterz@infradead.org,
	mingo@redhat.com, vincent.guittot@linaro.org, wanpeng.li@hotmail.com
Subject: Re: [PATCH 1/2] sched/deadline: add per rq tracking of admitted bandwidth
Message-ID: <20160211140545.3c9e6e41@utopia>
In-Reply-To: <20160211124959.GO11415@e106622-lin>
References: <1454935531-7541-1-git-send-email-juri.lelli@arm.com>
	<1454935531-7541-2-git-send-email-juri.lelli@arm.com>
	<20160210113258.GX11415@e106622-lin>
	<20160210093702.10c655be@gandalf.local.home>
	<20160210162748.GI11415@e106622-lin>
	<20160211121257.GL11415@e106622-lin>
	<20160211132254.1a369fe9@utopia>
	<20160211122754.GN11415@e106622-lin>
	<20160211134018.6b15fd68@utopia>
	<20160211124959.GO11415@e106622-lin>
Organization: university of trento
X-Mailer: Claws Mail 3.12.0 (GTK+ 2.24.28; x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 11 Feb 2016 12:49:59 +0000
Juri Lelli wrote:
[...]
> > > > > Luca, did you already face this problem (if I got it right)
> > > > > and thought of a way to fix it? I'll go back and stare a bit
> > > > > more at those paths.
> > > > In my patch I took care of the first case (modifying
> > > > select_task_rq_dl() to move the utilization from the "old rq"
> > > > to the "new rq"), but I never managed to trigger
> > > > select_fallback_rq() in my tests, so I overlooked that case.
> > > >
> > >
> > > Right, I was thinking to do the same. And you did that after
> > > grabbing both locks, right?
> >
> > Not sure if I did everything correctly, but my code in
> > select_task_rq_dl() currently looks like this (you can obviously
> > ignore the "migrate_active" and "*_running_bw()" parts, and focus on
> > the "*_rq_bw()" stuff):
> > [...]
> >         if (rq != cpu_rq(cpu)) {
> >                 int migrate_active;
> >
> >                 raw_spin_lock(&rq->lock);
> >                 migrate_active = hrtimer_active(&p->dl.inactive_timer);
> >                 if (migrate_active) {
> >                         hrtimer_try_to_cancel(&p->dl.inactive_timer);
> >                         sub_running_bw(&p->dl, &rq->dl);
> >                 }
> >                 sub_rq_bw(&p->dl, &rq->dl);
> >                 raw_spin_unlock(&rq->lock);
> >                 rq = cpu_rq(cpu);
>
> Can't something happen here? My problem is that I use per-rq bw
> tracking to save/restore root_domain state. So, I fear that a
> root_domain update can happen while we are in the middle of moving bw
> from one cpu to another.

Well, I never used the rq utilization to re-build the root_domain
utilization (and I never played with root domains too much... :)...
So, I do not really know. Maybe the code should do:
	raw_spin_lock(&rq->lock);
	raw_spin_lock(&cpu_rq(cpu)->lock);
	sub_rq_bw(&p->dl, &rq->dl);
	add_rq_bw(&p->dl, &cpu_rq(cpu)->dl);
	[...]
?


			Luca
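
To make the "hold both locks while moving the bw" idea above concrete,
here is a rough userspace sketch. It is not kernel code and not the
patch under discussion: struct toy_rq, toy_double_lock(), toy_move_bw()
and toy_total_bw() are hypothetical names, and pthread mutexes stand in
for rq->lock. The one thing it adds to the snippet above is taking the
two locks in a fixed (address) order, since two CPUs migrating tasks in
opposite directions could otherwise deadlock; as far as I know the
scheduler's double_rq_lock() helper exists for the same reason.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a runqueue: one lock, one bandwidth counter. */
struct toy_rq {
	pthread_mutex_t lock;
	uint64_t this_bw;	/* admitted bandwidth accounted on this rq */
};

/* Take both locks in address order, so that concurrent a->b and b->a
 * moves cannot end up waiting on each other (ABBA deadlock). */
static void toy_double_lock(struct toy_rq *a, struct toy_rq *b)
{
	if (a == b) {
		pthread_mutex_lock(&a->lock);
	} else if ((uintptr_t)a < (uintptr_t)b) {
		pthread_mutex_lock(&a->lock);
		pthread_mutex_lock(&b->lock);
	} else {
		pthread_mutex_lock(&b->lock);
		pthread_mutex_lock(&a->lock);
	}
}

static void toy_double_unlock(struct toy_rq *a, struct toy_rq *b)
{
	pthread_mutex_unlock(&a->lock);
	if (a != b)
		pthread_mutex_unlock(&b->lock);
}

/* Move bw from src to dst with both locks held: anyone who also takes
 * both locks sees the bandwidth on exactly one of the two rqs. */
static void toy_move_bw(struct toy_rq *src, struct toy_rq *dst, uint64_t bw)
{
	toy_double_lock(src, dst);
	assert(src->this_bw >= bw);	/* refuse to underflow the counter */
	src->this_bw -= bw;
	dst->this_bw += bw;
	toy_double_unlock(src, dst);
}

/* Sum the bandwidth of two distinct rqs under both locks; a concurrent
 * toy_move_bw() between them cannot change the result. */
static uint64_t toy_total_bw(struct toy_rq *a, struct toy_rq *b)
{
	uint64_t sum;

	toy_double_lock(a, b);
	sum = a->this_bw + b->this_bw;
	toy_double_unlock(a, b);
	return sum;
}
```

With two toy_rq instances initialized with PTHREAD_MUTEX_INITIALIZER,
calling toy_move_bw(&a, &b, bw) leaves toy_total_bw(&a, &b) unchanged.
Whether this is enough for the root_domain save/restore case depends on
which locks that path holds, which the sketch does not model.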