From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shrikanth Hegde
To: linux-kernel@vger.kernel.org, mingo@kernel.org, peterz@infradead.org,
	juri.lelli@redhat.com, vincent.guittot@linaro.org, yury.norov@gmail.com,
	kprateek.nayak@amd.com, iii@linux.ibm.com
Cc: sshegde@linux.ibm.com, tglx@kernel.org, gregkh@linuxfoundation.org,
	pbonzini@redhat.com, seanjc@google.com, vschneid@redhat.com,
	huschle@linux.ibm.com, rostedt@goodmis.org, dietmar.eggemann@arm.com,
	mgorman@suse.de, bsegall@google.com, maddy@linux.ibm.com,
	srikar@linux.ibm.com, hdanton@sina.com, chleroy@kernel.org,
	vineeth@bitbyteword.org, frederic@kernel.org, arighi@nvidia.com,
	pauld@redhat.com, christian.loehle@arm.com, tj@kernel.org,
	tommaso.cucinotta@gmail.com, maz@kernel.org, rafael@kernel.org
Subject: [PATCH v3 17/20] sched/core: Introduce default arch handling code for inc/dec preferred CPUs
Date: Thu, 14 May 2026 20:52:01 +0530
Message-ID: <20260514152204.481115-18-sshegde@linux.ibm.com>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20260514152204.481115-1-sshegde@linux.ibm.com>
References: <20260514152204.481115-1-sshegde@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Define default handlers for high and low steal time. An architecture
with better decision logic may override the default implementation.

- If the steal time is higher than the threshold, reduce the number of
  preferred CPUs by one core. The last core in the intersection of
  online and preferred CPUs is marked as non-preferred. At least one
  core is always left as preferred.

- If the steal time is lower than the threshold, increase the number of
  preferred CPUs by one core. The first online core that is not in
  cpu_preferred_mask is marked as preferred. If all cores are already
  preferred, bail out.

Increasing or decreasing may need to modify the splicing across NUMA
nodes. It is being kept simple for now.

Signed-off-by: Shrikanth Hegde
---
 include/linux/sched.h |  2 ++
 kernel/sched/core.c   | 58 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 738f17d63943..2afbcd70f0ac 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2529,6 +2529,8 @@ struct steal_monitor_t {
 };
 
 extern struct steal_monitor_t steal_mon;
+void arch_dec_preferred_cpus(struct steal_monitor_t *sm, u64 steal_ratio);
+void arch_inc_preferred_cpus(struct steal_monitor_t *sm, u64 steal_ratio);
 #endif
 
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a3f65e9c7d30..195e3648b1b5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -11368,6 +11368,64 @@ void sched_init_steal_monitor(void)
 	steal_mon.sampling_period_ms = 1000; /* once per second */
 }
 
+/*
+ * Default implementation of decrementing the preferred CPUs based on steal
+ * time. This is simple logic and decrease the preferred CPUs by 1 core.
+ * It takes out the last core in the online & preferred.
+ *
+ * Ensure at least one housekeeping core is always kept as preferred
+ *
+ * Could be overwritten by arch specific handling.
+ */
+#ifndef arch_dec_preferred_cpus
+void arch_dec_preferred_cpus(struct steal_monitor_t *sm, u64 steal_ratio)
+{
+	int last_cpu, tmp_cpu;
+	int this_cpu = raw_smp_processor_id();
+
+	cpumask_and(sm->tmp_mask, cpu_online_mask, cpu_preferred_mask);
+	last_cpu = cpumask_last(sm->tmp_mask);
+
+	/*
+	 * If the core belongs to the housekeeping CPUs, no action is
+	 * taken. This leaves at least one core preferred always.
+	 * This ensures at least some CPUs are available to run
+	 */
+	if (cpumask_equal(cpu_smt_mask(last_cpu), cpu_smt_mask(this_cpu)))
+		return;
+
+	for_each_cpu_and(tmp_cpu, cpu_smt_mask(last_cpu), cpu_online_mask) {
+		set_cpu_preferred(tmp_cpu, false);
+		if (tick_nohz_full_cpu(tmp_cpu))
+			tick_nohz_dep_set_cpu(tmp_cpu, TICK_DEP_BIT_SCHED);
+	}
+}
+#endif
+
+/*
+ * Default implementation of incrementing preferred CPUs based on steal
+ * time. This is simple logic and increases the preferred CPUs by 1 core.
+ * It adds the first core in online & !preferred
+ *
+ * Nothing to do if online == preferred
+ *
+ * Could be overwritten by arch specific handling.
+ */
+#ifndef arch_inc_preferred_cpus
+void arch_inc_preferred_cpus(struct steal_monitor_t *sm, u64 steal_ratio)
+{
+	int first_cpu, tmp_cpu;
+
+	first_cpu = cpumask_first_andnot(cpu_online_mask, cpu_preferred_mask);
+	/* All CPUs are preferred. Nothing to increase further */
+	if (first_cpu >= nr_cpu_ids)
+		return;
+
+	for_each_cpu_and(tmp_cpu, cpu_smt_mask(first_cpu), cpu_online_mask)
+		set_cpu_preferred(tmp_cpu, true);
+}
+#endif
+
 /* This is only a skeleton. Subsequent patches introduce more of it */
 void sched_steal_detection_work(struct work_struct *work)
 {
-- 
2.47.3
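
For readers who want to experiment with the shrink/grow policy outside
the kernel, here is a minimal user-space sketch of the two default
handlers. It is not the kernel code: every sim_* name is hypothetical,
CPU masks are modelled as a single 64-bit word, and SMT siblings are
assumed to be contiguous with two threads per core. The empty-mask
check in sim_dec_preferred() is an addition of this sketch and has no
counterpart in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not kernel API: at most 64 CPUs, 2 threads per core. */
#define SIM_NR_CPUS          64
#define SIM_THREADS_PER_CORE 2

struct sim_state {
	uint64_t online;     /* stands in for cpu_online_mask */
	uint64_t preferred;  /* stands in for cpu_preferred_mask */
};

/* Mask of all SMT siblings of @cpu, assuming contiguous sibling numbering. */
static uint64_t sim_smt_mask(int cpu)
{
	int first = cpu - (cpu % SIM_THREADS_PER_CORE);
	uint64_t core = (1ULL << SIM_THREADS_PER_CORE) - 1;

	assert(cpu >= 0 && cpu < SIM_NR_CPUS);
	return core << first;
}

/* Highest set bit in @mask, or -1 if the mask is empty. */
static int sim_last_cpu(uint64_t mask)
{
	for (int cpu = SIM_NR_CPUS - 1; cpu >= 0; cpu--)
		if (mask & (1ULL << cpu))
			return cpu;
	return -1;
}

/* Lowest set bit in @mask, or -1 if the mask is empty. */
static int sim_first_cpu(uint64_t mask)
{
	for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		if (mask & (1ULL << cpu))
			return cpu;
	return -1;
}

/*
 * High steal: drop the last online & preferred core, unless it is the
 * core of @this_cpu (the CPU running the check). Returns true if a core
 * was removed from the preferred set.
 */
static bool sim_dec_preferred(struct sim_state *s, int this_cpu)
{
	int last = sim_last_cpu(s->online & s->preferred);

	if (last < 0)  /* sketch-only guard for an empty intersection */
		return false;
	if (sim_smt_mask(last) == sim_smt_mask(this_cpu))
		return false;

	s->preferred &= ~(sim_smt_mask(last) & s->online);
	return true;
}

/*
 * Low steal: add the first online core that is not yet preferred.
 * Returns false when every online CPU is already preferred.
 */
static bool sim_inc_preferred(struct sim_state *s)
{
	int first = sim_first_cpu(s->online & ~s->preferred);

	if (first < 0)
		return false;

	s->preferred |= sim_smt_mask(first) & s->online;
	return true;
}
```

With eight online CPUs (mask 0xFF) and the check running on CPU 0,
repeated calls to sim_dec_preferred() shrink the preferred mask one
core at a time from the top and then refuse to remove the core that
contains CPU 0. A following sim_inc_preferred() call restores the
lowest non-preferred core first, so shrinking and growing do not retrace
the same order.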