From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752093AbcBKMka (ORCPT ); Thu, 11 Feb 2016 07:40:30 -0500
Received: from mail-wm0-f67.google.com ([74.125.82.67]:34631 "EHLO
	mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753124AbcBKMkZ (ORCPT );
	Thu, 11 Feb 2016 07:40:25 -0500
Date: Thu, 11 Feb 2016 13:40:18 +0100
From: luca abeni
To: Juri Lelli
Cc: Steven Rostedt, linux-kernel@vger.kernel.org, peterz@infradead.org,
	mingo@redhat.com, vincent.guittot@linaro.org, wanpeng.li@hotmail.com
Subject: Re: [PATCH 1/2] sched/deadline: add per rq tracking of admitted bandwidth
Message-ID: <20160211134018.6b15fd68@utopia>
In-Reply-To: <20160211122754.GN11415@e106622-lin>
References: <1454935531-7541-1-git-send-email-juri.lelli@arm.com>
	<1454935531-7541-2-git-send-email-juri.lelli@arm.com>
	<20160210113258.GX11415@e106622-lin>
	<20160210093702.10c655be@gandalf.local.home>
	<20160210162748.GI11415@e106622-lin>
	<20160211121257.GL11415@e106622-lin>
	<20160211132254.1a369fe9@utopia>
	<20160211122754.GN11415@e106622-lin>
Organization: university of trento
X-Mailer: Claws Mail 3.12.0 (GTK+ 2.24.28; x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 11 Feb 2016 12:27:54 +0000
Juri Lelli wrote:

> On 11/02/16 13:22, Luca Abeni wrote:
> > Hi Juri,
> >
> > On Thu, 11 Feb 2016 12:12:57 +0000
> > Juri Lelli wrote:
> > [...]
> > > I think we still have (at least) two problems:
> > >
> > >  - select_task_rq_dl, if we select a different target
> > >  - select_task_rq might make use of select_fallback_rq, if
> > >    cpus_allowed changed after the task went to sleep
> > >
> > > Second case is what creates the problem here, as we don't update
> > > task_rq(p) and fallback_cpu ac_bw.
> > > I was thinking we might do so, maybe adding fallback_cpu in
> > > task_struct, from migrate_task_rq_dl() (it has to be added yes),
> > > but I fear that we should hold both rq locks :/.
> > >
> > > Luca, did you already face this problem (if I got it right) and
> > > thought of a way to fix it? I'll go back and stare a bit more at
> > > those paths.
> > In my patch I took care of the first case (modifying
> > select_task_rq_dl() to move the utilization from the "old rq" to the
> > "new rq"), but I never managed to trigger select_fallback_rq() in my
> > tests, so I overlooked that case.
> >
>
> Right, I was thinking to do the same. And you did that after grabbing
> both locks, right?

Not sure if I did everything correctly, but my code in
select_task_rq_dl() currently looks like this (you can obviously
ignore the "migrate_active" and "*_running_bw()" parts, and focus on
the "*_rq_bw()" stuff):
[...]
	if (rq != cpu_rq(cpu)) {
		int migrate_active;

		raw_spin_lock(&rq->lock);
		migrate_active = hrtimer_active(&p->dl.inactive_timer);
		if (migrate_active) {
			hrtimer_try_to_cancel(&p->dl.inactive_timer);
			sub_running_bw(&p->dl, &rq->dl);
		}
		sub_rq_bw(&p->dl, &rq->dl);
		raw_spin_unlock(&rq->lock);
		rq = cpu_rq(cpu);
		raw_spin_lock(&rq->lock);
		add_rq_bw(&p->dl, &rq->dl);
		if (migrate_active)
			add_running_bw(&p->dl, &rq->dl);
		raw_spin_unlock(&rq->lock);
	}
[...]

lockdep is not screaming, and I am not able to trigger any race
condition or strange behaviour (I am currently at more than 24h of
continuous stress-testing, but maybe my testcase is not so good in
finding races here :)


			Luca
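
To make the locking pattern in the snippet above easier to see in
isolation, here is a rough userspace sketch of the "*_rq_bw()" part
only. It is a toy model and not kernel code: toy_rq, toy_task,
toy_rq_init(), toy_add_rq_bw(), toy_sub_rq_bw() and toy_move_bw() are
made-up names, and a pthread mutex stands in for rq->lock. It assumes,
as the snippet does, that the two locks are taken one after the other
and never held together.

```c
/*
 * Toy userspace model, NOT kernel code. All toy_* names are hypothetical
 * stand-ins for struct rq, struct task_struct, add_rq_bw() and sub_rq_bw().
 */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct toy_rq {
	pthread_mutex_t lock;	/* stands in for rq->lock */
	uint64_t rq_bw;		/* total bandwidth accounted on this runqueue */
};

struct toy_task {
	uint64_t dl_bw;		/* bandwidth of this task */
};

void toy_rq_init(struct toy_rq *rq)
{
	pthread_mutex_init(&rq->lock, NULL);
	rq->rq_bw = 0;
}

/* Caller must hold rq->lock. */
void toy_add_rq_bw(struct toy_task *p, struct toy_rq *rq)
{
	rq->rq_bw += p->dl_bw;
}

/* Caller must hold rq->lock. */
void toy_sub_rq_bw(struct toy_task *p, struct toy_rq *rq)
{
	assert(rq->rq_bw >= p->dl_bw);	/* catch accounting underflow */
	rq->rq_bw -= p->dl_bw;
}

/* Move p's bandwidth from old_rq to new_rq, taking one lock at a time. */
void toy_move_bw(struct toy_task *p, struct toy_rq *old_rq,
		 struct toy_rq *new_rq)
{
	if (old_rq == new_rq)
		return;

	pthread_mutex_lock(&old_rq->lock);
	toy_sub_rq_bw(p, old_rq);
	pthread_mutex_unlock(&old_rq->lock);

	/* In this window p's bandwidth is accounted on neither runqueue. */

	pthread_mutex_lock(&new_rq->lock);
	toy_add_rq_bw(p, new_rq);
	pthread_mutex_unlock(&new_rq->lock);
}
```

Since the locks are never nested, no lock ordering is needed here. The
price is the short window in which the task's bandwidth is counted on
neither runqueue; whether that is acceptable in the real code is the
open question in this thread, and the sketch does not settle it.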