Date: Tue, 5 May 2026 09:37:21 +0530
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 03/17] cpumask: Introduce cpu_preferred_mask
To: Yury Norov
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org, peterz@infradead.org,
 juri.lelli@redhat.com, vincent.guittot@linaro.org, tglx@linutronix.de,
 yury.norov@gmail.com, gregkh@linuxfoundation.org, pbonzini@redhat.com,
 seanjc@google.com, kprateek.nayak@amd.com, vschneid@redhat.com,
 iii@linux.ibm.com, huschle@linux.ibm.com, rostedt@goodmis.org,
 dietmar.eggemann@arm.com, mgorman@suse.de, bsegall@google.com,
 maddy@linux.ibm.com, srikar@linux.ibm.com, hdanton@sina.com,
 chleroy@kernel.org, vineeth@bitbyteword.org, joelagnelf@nvidia.com
References: <20260407191950.643549-1-sshegde@linux.ibm.com>
 <20260407191950.643549-4-sshegde@linux.ibm.com>
 <0d8412de-e18a-476f-9eb6-9a977f4474a3@linux.ibm.com>
From: Shrikanth Hegde

Hi Yury.

On 4/8/26 11:27 PM, Yury Norov wrote:
> I suggest adding, for example, config PREFERRED_CPUS that would select
> PARAVIRT, and would be disabled by default.
>
> Regardless, whatever you decide, please keep all the cpu_paravirt_mask
> ifdefery on the cpumasks level. For example, in patch #5:
>
> +#ifdef CONFIG_PARAVIRT
> +static inline bool task_can_run_on_preferred_cpu(struct task_struct *p)
> +{
> +	return cpumask_intersects(p->cpus_ptr, cpu_preferred_mask);
> +}
> +#else
> +static inline bool task_can_run_on_preferred_cpu(struct task_struct *p)
> +{
> +	return true;
> +}
> +#endif
>
> That looks wrong to me. Instead, either declare cpu_preferred_mask
> unconditionally, and maintain it well, or
>
> +#ifdef CONFIG_PREFERRED_CPUS
> +extern struct cpumask __cpu_preferred_mask;
> +#else
> +#define __cpu_preferred_mask __cpu_online_mask
> +#endif
>
> This way, your higher level code will be clean.
>
> Thanks,
> Yury

Thanks Yury for the suggestion. This method is indeed cleaner.

So I have made it as below. It is a rough patch; I may still have to
clean it up. But this helps to get rid of much of the ifdeffery
elsewhere. Most of the ifdeffery will be in cpumask.h and sched.h.

Hopefully I should be able to send the series this week.

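For illustration only, the fallback can be modelled in user space roughly
as below. This is a sketch and not kernel code: struct cpumask is reduced
to a single word, NR_CPUS is an arbitrary value, PREFERRED_CPUS is a plain
preprocessor symbol standing in for the Kconfig option, and
mask_test_cpu()/mask_assign_cpu() are hypothetical helpers. The point is
that cpu_preferred() needs no #ifdef of its own, because the alias is
resolved at the mask level.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8	/* arbitrary size for this user-space model */

/* Simplified stand-in for the kernel's struct cpumask. */
struct cpumask {
	unsigned long bits;
};

struct cpumask __cpu_online_mask;

/* Leave PREFERRED_CPUS undefined to model the option being disabled. */
#ifdef PREFERRED_CPUS
struct cpumask __cpu_preferred_mask;
#else
#define __cpu_preferred_mask __cpu_online_mask
#endif

#define cpu_online_mask    ((const struct cpumask *)&__cpu_online_mask)
#define cpu_preferred_mask ((const struct cpumask *)&__cpu_preferred_mask)

/* Hypothetical helper: test one bit of a mask. */
static bool mask_test_cpu(unsigned int cpu, const struct cpumask *mask)
{
	assert(cpu < NR_CPUS);
	return (mask->bits >> cpu) & 1UL;
}

/* Hypothetical helper: set or clear one bit of a mask. */
static void mask_assign_cpu(unsigned int cpu, struct cpumask *mask, bool set)
{
	assert(cpu < NR_CPUS);
	if (set)
		mask->bits |= 1UL << cpu;
	else
		mask->bits &= ~(1UL << cpu);
}

/* Higher-level code: identical whether or not the option is enabled. */
static bool cpu_preferred(unsigned int cpu)
{
	return mask_test_cpu(cpu, cpu_preferred_mask);
}
```

With PREFERRED_CPUS undefined, marking a CPU online through
mask_assign_cpu() also makes cpu_preferred() true for it, since both
names resolve to the same storage.
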
---
diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
index 88c594c6d7fc..c62001b52fab 100644
--- a/kernel/Kconfig.preempt
+++ b/kernel/Kconfig.preempt
@@ -192,3 +192,17 @@ config SCHED_CLASS_EXT
 	  For more information:
 	    Documentation/scheduler/sched-ext.rst
 	    https://github.com/sched-ext/scx
+
+config PREFERRED_CPU
+	bool "Dynamic vCPU management based on steal time"
+	default y if PARAVIRT
+	help
+	  This feature helps to reduce the steal time in paravirtualised
+	  environment, thereby reducing vCPU preemption. Reducing vCPU
+	  preemption provides improved lock holder preemption and reduces
+	  cost of vCPU preemption in the host.
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 80211900f373..577b8d992a45 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -120,12 +120,20 @@ extern struct cpumask __cpu_enabled_mask;
 extern struct cpumask __cpu_present_mask;
 extern struct cpumask __cpu_active_mask;
 extern struct cpumask __cpu_dying_mask;
+
+#ifdef CONFIG_PREFERRED_CPU
+extern struct cpumask __cpu_preferred_mask;
+#else
+#define __cpu_preferred_mask __cpu_online_mask
+#endif
+
 #define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
 #define cpu_online_mask   ((const struct cpumask *)&__cpu_online_mask)
 #define cpu_enabled_mask   ((const struct cpumask *)&__cpu_enabled_mask)
 #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
 #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
 #define cpu_dying_mask    ((const struct cpumask *)&__cpu_dying_mask)
+#define cpu_preferred_mask ((const struct cpumask *)&__cpu_preferred_mask)
 
 extern atomic_t __num_online_cpus;
 extern unsigned int __num_possible_cpus;
@@ -1164,6 +1172,7 @@ void init_cpu_possible(const struct cpumask *src);
 
 void set_cpu_online(unsigned int cpu, bool online);
 void set_cpu_possible(unsigned int cpu, bool possible);
+void set_cpu_preferred(unsigned int cpu, bool preferred);
 
 /**
  * to_cpumask - convert a NR_CPUS bitmap to a struct cpumask *
@@ -1256,7 +1265,12 @@ static __always_inline bool cpu_dying(unsigned int cpu)
 	return cpumask_test_cpu(cpu, cpu_dying_mask);
 }
 
-#else
+static __always_inline bool cpu_preferred(unsigned int cpu)
+{
+	return cpumask_test_cpu(cpu, cpu_preferred_mask);
+}
+
+#else /* NR_CPUS <= 1 */
 
 #define num_online_cpus()	1U
 #define num_possible_cpus()	1U
@@ -1294,6 +1308,11 @@ static __always_inline bool cpu_dying(unsigned int cpu)
 	return false;
 }
 
+static __always_inline bool cpu_preferred(unsigned int cpu)
+{
+	return false;
+}
+
 #endif /* NR_CPUS > 1 */
 
 #define cpu_is_offline(cpu)	unlikely(!cpu_online(cpu))
diff --git a/kernel/cpu.c b/kernel/cpu.c
index bc4f7a9ba64e..7787c907f9b8 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -3107,6 +3107,11 @@ EXPORT_SYMBOL(__cpu_dying_mask);
 atomic_t __num_online_cpus __read_mostly;
 EXPORT_SYMBOL(__num_online_cpus);
 
+#ifdef CONFIG_PREFERRED_CPU
+struct cpumask __cpu_preferred_mask __read_mostly;
+EXPORT_SYMBOL(__cpu_preferred_mask);
+#endif
+
 void init_cpu_present(const struct cpumask *src)
 {
 	cpumask_copy(&__cpu_present_mask, src);
@@ -3154,6 +3159,14 @@ void set_cpu_possible(unsigned int cpu, bool possible)
 	}
 }
 
+void set_cpu_preferred(unsigned int cpu, bool preferred)
+{
+	if (!IS_ENABLED(CONFIG_PREFERRED_CPU))
+		return;
+
+	assign_cpu((cpu), &__cpu_preferred_mask, (preferred));
+}
+
 /*
  * Activate the first processor.
  */
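
A note on the early return in set_cpu_preferred(): when the option is
off, __cpu_preferred_mask is only an alias for __cpu_online_mask, so
without the guard the setter would modify the online mask. The user-space
sketch below models just that behaviour. It is an illustration under
assumptions, not kernel code: PREFERRED_CPU_ENABLED is a hypothetical
constant standing in for IS_ENABLED(CONFIG_PREFERRED_CPU), struct cpumask
is a single word, and model_assign_cpu() is a hypothetical replacement
for assign_cpu().

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8			/* arbitrary size for this model */
#define PREFERRED_CPU_ENABLED 0		/* hypothetical stand-in for IS_ENABLED() */

/* Simplified stand-in for the kernel's struct cpumask. */
struct cpumask {
	unsigned long bits;
};

struct cpumask __cpu_online_mask;

#if PREFERRED_CPU_ENABLED
struct cpumask __cpu_preferred_mask;
#else
#define __cpu_preferred_mask __cpu_online_mask
#endif

/* Hypothetical helper: set or clear one bit of a mask. */
static void model_assign_cpu(unsigned int cpu, struct cpumask *mask, bool set)
{
	assert(cpu < NR_CPUS);
	if (set)
		mask->bits |= 1UL << cpu;
	else
		mask->bits &= ~(1UL << cpu);
}

void set_cpu_online(unsigned int cpu, bool online)
{
	model_assign_cpu(cpu, &__cpu_online_mask, online);
}

void set_cpu_preferred(unsigned int cpu, bool preferred)
{
	/* With the option off, the alias points at the online mask: do nothing. */
	if (!PREFERRED_CPU_ENABLED)
		return;

	model_assign_cpu(cpu, &__cpu_preferred_mask, preferred);
}
```

With PREFERRED_CPU_ENABLED set to 0, calling set_cpu_preferred(1, false)
after set_cpu_online(1, true) leaves CPU 1 online in this model.
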