From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fernand Sieber <sieberf@amazon.com>
To: Vincent Guittot
CC: Peter Zijlstra, ...
Subject: Re: [PATCH] sched/fair: Force idle aware load balancing
Date: Mon, 1 Dec 2025 14:58:49 +0200
Message-ID: <20251201125851.272237-1-sieberf@amazon.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To:
References:
 <20251127202719.963766-1-sieberf@amazon.com>
 <20251128111427.GJ3245006@noisy.programming.kicks-ass.net>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Fri, 28 Nov 2025 at 14:50, Vincent Guittot wrote:
> On Fri, 28 Nov 2025 at 12:14, Peter Zijlstra wrote:
> >
> > On Thu, Nov 27, 2025 at 10:27:17PM +0200, Fernand Sieber wrote:
> >
> > > @@ -11123,7 +11136,8 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
> > >  		return;
> > >  	}
> > >
> > > -	if (busiest->group_type == group_smt_balance) {
> > > +	if (busiest->group_type == group_smt_balance ||
> > > +	    busiest->forceidle_weight) {
> >
> > Should we not instead make it so that we select group_smt_balance in
> > this case?
>
> Why do we need this test ? We have already removed forced idle cpus
> from statistics ?
>
> I suppose Fernand wants to cover cases where there is 1 task per CPU
> so we are balanced but one CPU is forced idle and we want to force
> migrating a task to then try to move back another one ? In this case
> it should be detected early and become group_imbalanced type
> Also what happens if we could migrate more than one task

I've removed this override in v2; it doesn't seem to make much of a
difference after doing more benchmarking.

When I traced LB inefficiencies, I noticed that in some situations a
large imbalance (overloaded vs spare capacity) was detected, but
remediation was delayed. So the intention of the override was to
"nudge" the LB into taking a remediation action immediately,
regardless of the load to move, with the idea that it's better to
migrate anything now than to waste capacity in force idle for longer.
This override was probably not the right tool for that.
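For illustration, the "balanced but forced idle" scenario Vincent describes (one runnable task per CPU, so plain nr_running accounting sees no imbalance) can be modeled with a toy userspace check. This is only a sketch: `toy_group_stats` and its `forceidle_weight` field are made-up stand-ins for the idea, not the actual fair.c statistics or the patch's implementation:

```c
#include <assert.h>

/* Toy per-group stats, loosely inspired by sg_lb_stats-style fields. */
struct toy_group_stats {
	unsigned int sum_nr_running;   /* runnable tasks in the group */
	unsigned int group_weight;     /* number of CPUs in the group */
	unsigned int forceidle_weight; /* CPUs held in forced idle (illustrative) */
};

/* Naive check: one task per CPU looks perfectly balanced, so no pull. */
static int naive_needs_balance(const struct toy_group_stats *busiest,
			       const struct toy_group_stats *local)
{
	return busiest->sum_nr_running > busiest->group_weight &&
	       local->sum_nr_running < local->group_weight;
}

/*
 * Force-idle aware check: CPUs sitting in forced idle contribute no real
 * capacity, so compare runnable tasks against the *effective* width of the
 * group. A nominally balanced busiest group then still reports work to pull.
 */
static int fi_aware_needs_balance(const struct toy_group_stats *busiest,
				  const struct toy_group_stats *local)
{
	unsigned int effective = busiest->group_weight - busiest->forceidle_weight;

	return busiest->sum_nr_running > effective &&
	       local->sum_nr_running < local->group_weight;
}
```

With 4 tasks on 4 CPUs of which one is forced idle, the naive check sees balance while the force-idle aware check flags the group, which is the gap the override was trying to close.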
If I get a chance I'll try to dive deeper and provide more details.

A different thing I noticed is that the task_hot() check includes a
cookie check which is more or less bound to fail on a busy, large
system running lots of tasks with different cookies (e.g. a hypervisor
on large servers with cookied, time-shared vCPUs), because there's
almost zero chance that the target CPU happens to be running the same
cookie as the migrating task. This delays migrations unnecessarily
when the run queues are short and there are no valid spare candidates.
I need to think more about that one, but if you have any ideas let me
know. Maybe instead of this check, the list of migrating tasks should
be sorted to prioritize tasks with a matching cookie first, if any,
similar to what is proposed in the cache aware scheduling RFC?
https://lwn.net/ml/all/26e7bfa88163e13ba1ebefbb54ecf5f42d84f884.1760206683.git.tim.c.chen@linux.intel.com/

Amazon Development Centre (South Africa) (Proprietary) Limited
29 Gogosoa Street, Observatory, Cape Town, Western Cape, 7925, South Africa
Registration Number: 2004 / 034463 / 07
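The sorting idea above could look something like the following. This is a userspace sketch under stated assumptions: a plain array stands in for the rq's task list, `toy_task`, `cookie` and `prioritize_matching_cookie()` are made-up names, and a stable partition is used so relative order is preserved within each class, mimicking how a detach_tasks()-style scan would then try compatible tasks first:

```c
#include <assert.h>
#include <stddef.h>

/* Toy migration candidate; 'cookie' stands in for a core-sched cookie. */
struct toy_task {
	int pid;
	unsigned long cookie;
};

/*
 * Stable partition: move tasks whose cookie matches the destination CPU's
 * currently running cookie to the front of the candidate list, preserving
 * relative order. A scan over the list then considers compatible tasks
 * first instead of rejecting candidates one by one on a cookie mismatch.
 */
static void prioritize_matching_cookie(struct toy_task *tasks, size_t n,
				       unsigned long dst_cookie)
{
	struct toy_task tmp;
	size_t front = 0; /* end of the matching-cookie prefix */

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].cookie == dst_cookie) {
			tmp = tasks[i];
			/* shift the non-matching run up by one slot */
			for (size_t j = i; j > front; j--)
				tasks[j] = tasks[j - 1];
			tasks[front++] = tmp;
		}
	}
}
```

For example, with candidates carrying cookies {7, 3, 7, 3, 9} and a destination cookie of 3, the two cookie-3 tasks end up first (in their original relative order), followed by the others unchanged.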