Subject: Re: [PATCH v3 1/4] sched/fair: Check CPU capacity before comparing group types during load balance
From: Tim Chen
To: Ricardo Neri, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Chen Yu, Christian Loehle, Barry Song
Cc: "Rafael J. Wysocki", Len Brown, ricardo.neri@intel.com, linux-kernel@vger.kernel.org
Date: Fri, 15 May 2026 12:26:57 -0700
In-Reply-To: <20260514-rneri-fix-cas-clusters-v3-1-0037869554bd@linux.intel.com>
References: <20260514-rneri-fix-cas-clusters-v3-0-0037869554bd@linux.intel.com> <20260514-rneri-fix-cas-clusters-v3-1-0037869554bd@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2026-05-14 at 11:34 -0700, Ricardo Neri wrote:
> update_sd_pick_busiest() may incorrectly select a fully_busy group as the
> busiest group when its per-CPU capacity exceeds that of the destination
> CPU. This happens because the type of busiest group is initialized to
> group_has_spare and allows the fully_busy group to win the type comparison.
>
> update_sd_pick_busiest() should not choose a candidate scheduling group
> with at most one runnable task if its per-CPU capacity is greater than that
> of the destination CPU. Such a check already exists, but it is done too
> late: after the type comparison, preventing a subsequent fully_busy group
> of equal per-CPU capacity from being correctly selected.
>
> Move this check to occur before comparing group types.

Looks good to me.

Reviewed-by: Tim Chen

>
> Fixes: 0b0695f2b34a ("sched/fair: Rework load_balance()")
> Reviewed-by: Christian Loehle
> Signed-off-by: Ricardo Neri
> ---
> Changes in v3:
>  * Added a Fixes tag. (Christian)
>  * Added Reviewed-by tag from Christian. Thanks!
>
> Changes in v2:
>  * Added a note clarifying that SMT and SD_ASYM_CPUCAPACITY are mutually
>    exclusive. (Tim)
>  * Kept parentheses around bitwise operators for clarity.
>  * Rewrote patch description for clarity.
> ---
>  kernel/sched/fair.c | 25 ++++++++++++++-----------
>  1 file changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3ebec186f982..e06e74d9ce0e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10818,6 +10818,20 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>  	     sds->local_stat.group_type != group_has_spare))
>  		return false;
>  
> +	/*
> +	 * Candidate sg has no more than one task per CPU and has higher
> +	 * per-CPU capacity. Migrating tasks to less capable CPUs may harm
> +	 * throughput. Maximize throughput, power/energy consequences are not
> +	 * considered.
> +	 *
> +	 * Systems with SMT are unaffected, as asymmetric capacity is not set
> +	 * in such cases.
> +	 */
> +	if ((env->sd->flags & SD_ASYM_CPUCAPACITY) &&
> +	    (sgs->group_type <= group_fully_busy) &&
> +	    (capacity_greater(sg->sgc->min_capacity, capacity_of(env->dst_cpu))))
> +		return false;
> +
>  	if (sgs->group_type > busiest->group_type)
>  		return true;
>  
> @@ -10920,17 +10934,6 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>  		break;
>  	}
>  
> -	/*
> -	 * Candidate sg has no more than one task per CPU and has higher
> -	 * per-CPU capacity. Migrating tasks to less capable CPUs may harm
> -	 * throughput. Maximize throughput, power/energy consequences are not
> -	 * considered.
> -	 */
> -	if ((env->sd->flags & SD_ASYM_CPUCAPACITY) &&
> -	    (sgs->group_type <= group_fully_busy) &&
> -	    (capacity_greater(sg->sgc->min_capacity, capacity_of(env->dst_cpu))))
> -		return false;
> -
>  	return true;
>  }
>  
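
For readers following the thread, the ordering problem described in the patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: struct group, pick(), find_busiest() and the check_first flag are hypothetical names, the avg_load tie-break stands in for the much larger switch statement in update_sd_pick_busiest(), the SD_ASYM_CPUCAPACITY flag is assumed to be set, and the 1024/1078 margin in capacity_greater() is assumed to mirror the kernel macro.

```c
#include <stdbool.h>

/* Simplified subset of the group types, ordered as in the patch discussion. */
enum group_type { group_has_spare = 0, group_fully_busy, group_overloaded };

/* Hypothetical, reduced stand-in for the per-group statistics. */
struct group {
	enum group_type type;
	unsigned long min_capacity;
	unsigned long avg_load;
};

/* Assumed to mirror the kernel's margin: a must exceed b by roughly 5%. */
static bool capacity_greater(unsigned long a, unsigned long b)
{
	return a * 1024 > b * 1078;
}

/* The check the patch moves: lightly loaded group with higher per-CPU capacity. */
static bool reject_higher_capacity(const struct group *sg, unsigned long dst_cap)
{
	return sg->type <= group_fully_busy &&
	       capacity_greater(sg->min_capacity, dst_cap);
}

/* check_first == false models the old ordering, true models the patched one. */
static bool pick(const struct group *sg, const struct group *busiest,
		 unsigned long dst_cap, bool check_first)
{
	if (check_first && reject_higher_capacity(sg, dst_cap))
		return false;

	if (sg->type > busiest->type)
		return true;	/* old ordering returns here, skipping the check */
	if (sg->type < busiest->type)
		return false;

	/* Simplified tie-break for groups of the same type. */
	if (sg->avg_load <= busiest->avg_load)
		return false;

	if (!check_first && reject_higher_capacity(sg, dst_cap))
		return false;

	return true;
}

/* Returns the index of the selected group, or -1 if none is selected. */
static int find_busiest(const struct group *groups, int n,
			unsigned long dst_cap, bool check_first)
{
	/* As in the patch description, busiest starts out as group_has_spare. */
	struct group busiest = { group_has_spare, 0, 0 };
	int idx = -1;

	for (int i = 0; i < n; i++) {
		if (pick(&groups[i], &busiest, dst_cap, check_first)) {
			busiest = groups[i];
			idx = i;
		}
	}
	return idx;
}
```

With a destination CPU of capacity 512 and two fully_busy groups, { group_fully_busy, 1024, 900 } followed by { group_fully_busy, 512, 500 }, the old ordering selects the first (higher-capacity) group because it wins the type comparison against the initial group_has_spare, and the second group then loses the tie-break. With the check done first, the higher-capacity group is rejected and the equal-capacity group is selected.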