From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mail-wm1-f52.google.com (mail-wm1-f52.google.com [209.85.128.52])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1E028334C33
	for <linux-kernel@vger.kernel.org>; Fri,  5 Dec 2025 13:26:57 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.52
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1764941222; cv=none; b=suBq25wXGsr8KXBrr9f4ovepcihcDVC1AIvt60B2GU+iRU/fmFaZ8apm7tr2EnQlgeppRwTT3UaAeSoOCKgSNbCgCgUzXOHpTdgje4zns3ORymHBQQ/CqjpnfsGsxeI/0eqgYmCP2pxgIN3pikFxzZIEjl2rzQvINJ4c006hcyI=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1764941222; c=relaxed/simple;
	bh=FrV+Zo5QZ0q/+E6Zoxb6BrVPFnYfnUZZMnX6ak5TS84=;
	h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version:
	 Content-Type:Content-Disposition:In-Reply-To; b=gazUT+B9MtyKx7KvKZhaKmf1K8uVqgZLEO85ZvwnBxUidsarK8uK7rhG7DFjP9Gyf8ENkVAzBA1DdHmJ3SdUnDfHT2sLdRpu/9gz0vDPROpLitOhNEoGNW8KBlIqsNs7e8/Jg/TztfozaRTJ0IuuMW5I0zdQb9mI+BYLYXyaN/s=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linaro.org; spf=pass smtp.mailfrom=linaro.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b=R8X1F0JI; arc=none smtp.client-ip=209.85.128.52
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linaro.org
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linaro.org
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="R8X1F0JI"
Received: by mail-wm1-f52.google.com with SMTP id 5b1f17b1804b1-477aa218f20so15510545e9.0
        for <linux-kernel@vger.kernel.org>; Fri, 05 Dec 2025 05:26:57 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=linaro.org; s=google; t=1764941215; x=1765546015; darn=vger.kernel.org;
        h=in-reply-to:content-transfer-encoding:content-disposition
         :mime-version:references:message-id:subject:cc:to:from:date:from:to
         :cc:subject:date:message-id:reply-to;
        bh=p8m77JDWVd9tj5+mmTnbvheiOrcOgsNFYWTCftVGG4o=;
        b=R8X1F0JIovWIbeJSEYxuNO64E98JqwpTfRbwqJ4kAkEYwzT9TUldi2m05ALR2bB2eo
         wIHaUaFHpiD/0Jjo+Cm3dhxJ6aPZTYc1J/NfLimANakfHUApbVam8Y+o65+V9LgNG7A4
         jnUebwa09M4vW6X/gYHhG0E0rXwWg6e2V2e3bLVGIyWA/vSNTDRqr79/yOpC6pRfRFrz
         l9E6BU/MYdV3BInwHYTBRHlb/onNC9VoYQggqHg9h3tziuh8vG9lB4hVfzyOlUXJtWF8
         O9G5c2YNPrwrzFkWv/zpM8FtU8XYHfNfIKc1gt/6vTCVPQME1jeQoY001+vHfDyYI19j
         YNRQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1764941215; x=1765546015;
        h=in-reply-to:content-transfer-encoding:content-disposition
         :mime-version:references:message-id:subject:cc:to:from:date:x-gm-gg
         :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=p8m77JDWVd9tj5+mmTnbvheiOrcOgsNFYWTCftVGG4o=;
        b=sd3cLlWJsAYOe1cXGoMAbb/6uPLuEeJjDbIEf6m2kW4QGKj1dy4dnWxiDRxExag680
         DoHzCw69MUPqZU3oZSvIW4AoSDlOuxQT+us7pmGRK6p4LTVUX6D846yAEkitacDeW/0W
         Lkc8HZTn3smxIZdlkg21nh6k4KbnoSVrinNrXwyVAjla3oWfAi5kFk2LWyVBxUdcPFHk
         F9TrOKALmUI1H7xYujue4uWSgKDkjdt2fIcHtsLy7eGdXNffzfYyPwFNwPoX4T0wz7+H
         Lac544m/gfFMr+DUDxyoFfmt5R0ydUYafV8g8QAOxLu0YaA5iRnnGzeuBOddweewtCFG
         1smw==
X-Forwarded-Encrypted: i=1; AJvYcCUQFLBxqGvNl6Lf5DoXH8NVya/vZIilUihqdcYi2VIABofvoq2oWSIhGZj4voYeblNcfX5yW8NVcxi+f3c=@vger.kernel.org
X-Gm-Message-State: AOJu0YwfsfV+3fQaxHSvAn8683bJi2mV7zbK4KxORebL+KFPU9joOiPL
	3R+qNexN46kyClp1o9tONtaNyC3zApr7iN71H2T2bit+1vB1ePX9lM00GVQ6AmPeiCI=
X-Gm-Gg: ASbGncsFBBXmbVEmYWxpMaxFZikkWLlqZFdohDrupHYGs6jgpektW7GxG5rDM8LxPqF
	LofFKrweqmcjxlP5LtY5DZgOrP+x1imy6+rtOwP0nL8WG4BOX9IGEeUvewX22YjCpCt2U+V8N+X
	uknwPuNErfR2purC95kwC5zmTEfuJZPfdPJy56vvqjW7VvvI8GcHmizV1XnAaNxxGDxwDBU2CTr
	bzcrWvcBwKINXHRLOYHdbko+FSoM4RG0+T/8//mjXb2eVXh0F86wE1789RXNSRjGW1XzEsjSyGf
	mqlYQCI6nqg831ZMWalRy1TxvNYRHOwbBZWPxcW8VRUO+OD0hlqawKL1SR685n6jeMkoxG3TUc6
	5WLTih1sh/QdFodAop49m8TvE4r0Z+N1IOGT/jh/Dc6fDLpoNefZ3yvIB1e3EHFDtosngQUdeJd
	JU32oPCEsJdnszB/Me/VQ3KYwfE7s=
X-Google-Smtp-Source: AGHT+IGiNyTBr5Twnhr/7Ct6/ebyU606eWA9wQordZlcK0qRoCV74vCRQRHNf2QqTQ0P+nbNkLi58w==
X-Received: by 2002:a05:600c:524d:b0:477:333a:f71f with SMTP id 5b1f17b1804b1-4792af30ea0mr94060395e9.17.1764941215338;
        Fri, 05 Dec 2025 05:26:55 -0800 (PST)
Received: from vingu-cube ([2a01:e0a:f:6020:eda0:7a2e:97bd:8641])
        by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-47930ca15adsm89064965e9.13.2025.12.05.05.26.54
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Fri, 05 Dec 2025 05:26:54 -0800 (PST)
Date: Fri, 5 Dec 2025 14:26:53 +0100
From: Vincent Guittot <vincent.guittot@linaro.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@redhat.com, juri.lelli@redhat.com, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, linux-kernel@vger.kernel.org,
	pierre.gondois@arm.com, kprateek.nayak@amd.com, qyousef@layalina.io,
	hongyan.xia2@arm.com, christian.loehle@arm.com,
	luis.machado@arm.com
Subject: Re: [PATCH 4/6 v8] sched/fair: Add push task mechanism for fair
Message-ID: <aTLdncXYqNyF9Bqq@vingu-cube>
References: <20251202181242.1536213-1-vincent.guittot@linaro.org>
 <20251202181242.1536213-5-vincent.guittot@linaro.org>
 <20251204112947.GK2528459@noisy.programming.kicks-ass.net>
 <CAKfTPtBGuS=bNN1ePf23pE6rPKSHbtM18FXuGjyAAOFaBB5eAA@mail.gmail.com>
 <20251205085912.GQ2528459@noisy.programming.kicks-ass.net>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20251205085912.GQ2528459@noisy.programming.kicks-ass.net>

Le vendredi 05 déc. 2025 à 09:59:12 (+0100), Peter Zijlstra a écrit :
> On Thu, Dec 04, 2025 at 03:34:15PM +0100, Vincent Guittot wrote:
> > On Thu, 4 Dec 2025 at 12:29, Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Tue, Dec 02, 2025 at 07:12:40PM +0100, Vincent Guittot wrote:
> > > > +/*
> > > > + * See if the non running fair tasks on this rq can be sent on other CPUs
> > > > + * that fits better with their profile.
> > > > + */
> > > > +static bool push_fair_task(struct rq *rq)
> > > > +{
> > > > +     struct task_struct *next_task;
> > > > +     int prev_cpu, new_cpu;
> > > > +     struct rq *new_rq;
> > > > +
> > > > +     next_task = pick_next_pushable_fair_task(rq);
> > > > +     if (!next_task)
> > > > +             return false;
> > > > +
> > > > +     if (is_migration_disabled(next_task))
> > > > +             return true;
> > > > +
> > > > +     /* We might release rq lock */
> > > > +     get_task_struct(next_task);
> > > > +
> > > > +     prev_cpu = rq->cpu;
> > > > +
> > > > +     new_cpu = select_task_rq_fair(next_task, prev_cpu, 0);
> > > > +
> > > > +     if (new_cpu == prev_cpu)
> > > > +             goto out;
> > > > +
> > > > +     new_rq = cpu_rq(new_cpu);
> > > > +
> > > > +     if (double_lock_balance(rq, new_rq)) {
> > > > +             /* The task has already migrated in between */
> > > > +             if (task_cpu(next_task) != rq->cpu) {
> > > > +                     double_unlock_balance(rq, new_rq);
> > > > +                     goto out;
> > > > +             }
> > > > +
> > > > +             deactivate_task(rq, next_task, 0);
> > > > +             set_task_cpu(next_task, new_cpu);
> > > > +             activate_task(new_rq, next_task, 0);
> > > > +
> > > > +             resched_curr(new_rq);
> > > > +
> > > > +             double_unlock_balance(rq, new_rq);
> > > > +     }
> > >
> > > Why not use move_queued_task() ?
> > 
> > double_lock_balance() can fail and prevent being blocked waiting for
> > new rq whereas move_queued_task() will wait, won't it ?
> > 
> > Do you think move_queued_task() would be better ?
> 
> No, double_lock_balance() never fails, the return value indicates if the
> currently held rq-lock, (the first argument) was unlocked while
> attaining both -- this is required when the first rq is a higher address
> than the second.
> 
> double_lock_balance() also puts the wait-time and hold time of the
> second inside the hold time of the first, which gets you a quadric term
> in the rq hold times IIRC. Something that's best avoided.

yeah, I misread the return and my current code need to be fixed like:

---
 kernel/sched/fair.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fbbe325dc633..35c7c968ddd2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8629,19 +8629,18 @@ static bool push_fair_task(struct rq *rq)
 
 	if (double_lock_balance(rq, new_rq)) {
 		/* The task has already migrated in between */
-		if (task_cpu(next_task) != rq->cpu) {
-			double_unlock_balance(rq, new_rq);
-			goto out;
-		}
+		if (task_cpu(next_task) != rq->cpu)
+			goto unlock;
+	}
 
-		deactivate_task(rq, next_task, 0);
-		set_task_cpu(next_task, new_cpu);
-		activate_task(new_rq, next_task, 0);
+	deactivate_task(rq, next_task, DEQUEUE_NOCLOCK);
+	set_task_cpu(next_task, new_cpu);
+	activate_task(new_rq, next_task, 0);
 
-		resched_curr(new_rq);
+	wakeup_preempt(new_rq, next_task, 0);
 
-		double_unlock_balance(rq, new_rq);
-	}
+unlock:
+	double_unlock_balance(rq, new_rq);
 
 out:
 	put_task_struct(next_task);
-- 
2.43.0


> 
> move_queued_task() OTOH takes the task off the runqueue you already hold
> locked, drops this lock, acquires the second, puts the task there, and
> returns with the dst rq locked.

I supposed it's doable even if we don't have rq_flags
But we need the re-lock the current rq and release the new one to let the balance_callback
loop in the same state

> 
> > In case of migrate_disable, push_fair_task() returns true and we
> > continue with the next task (It should not have much anyway). If the
> > task is migrate_disabled when we try to push it, we remove it from the
> > list anyway. At now, we try to not have more than 1 task in the list
> > to cap the overhead on sched_switch
> 
> Right, clearly I needed more wake-up juice, I thought it returned false
> and would stick around.