From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <917c1771-5249-4c10-9ecf-699cdd323cd9@linux.ibm.com>
Date: Tue, 23 Dec 2025 12:57:40 +0530
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH] sched/fair: Avoid false sharing in nohz struct
To: "Guo, Wangyang"
Cc: linux-kernel@vger.kernel.org, Benjamin Lei, Tim Chen, Tianyou Li,
 Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider
References: <20251211055612.4071266-1-wangyang.guo@intel.com>
 <7297e5e6-ae5a-42dc-8495-fddbb87ddf87@intel.com>
Content-Language: en-US
From: Shrikanth Hegde
In-Reply-To: <7297e5e6-ae5a-42dc-8495-fddbb87ddf87@intel.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 12/22/25 7:51 AM, Guo, Wangyang wrote:
> On 12/21/2025 9:05 PM, Shrikanth Hegde wrote:
>> Hi Wangyang,
>>
>> On 12/11/25 11:26 AM, Wangyang Guo wrote:
>>> There are two potential false sharing issue in nohz struct:
>>> 1. idle_cpus_mask is a read-mostly field, but share the same cacheline
>>>     with frequently updated nr_cpus.
>>
>> Updates to idle_cpus_mask is not same cacheline. it is updated
>> alongside nr_cpus.
>>
>> with CPUMASK_OFFSTACK=y, idle_cpus_mask is a pointer to the actual mask.
>> Updates to it happen in another cacheline.
>>
>> with CPUMASK_OFFSTACK=n, idle_cpus_mask is on the stack and its length
>> depends on NR_CPUS. typical value being 512/2048/8192 it can span a few
>> cachelines. So updates to it likely in different cacheline compared to
>> nr_cpus.
>>
>> see  https://lore.kernel.org/all/aS6bK4ad-wO2fsoo@gmail.com/
>>
> This patch is mainly target for idle_cpus_mask as a pointer, which is
> default for many distro OS.
>

Not all archs.

>>
>> Likely in your case, nr_cpus updates are the costly ones.
>> Try below and see if it helps to fix your issue too.
>> https://lore.kernel.org/all/20251201183146.74443-1-sshegde@linux.ibm.com/
>> I Should send out new version soon.
>>
>>> 2. Data followed by nohz still share the same cacheline and has
>>>     potential false sharing issue.
>>>
>>
>> How does your patch handle this?
>> I don't see any other struct apart from nohz being changed.
>
> The data follow by nohz is implicit and determined by compiler.
> For example, this is the layout from /proc/kallsyms in my machine:
> ffffffff88600d40 b nohz
> ffffffff88600d68 B arch_needs_tick_broadcast
> ffffffff88600d6c b __key.264
> ffffffff88600d6c b __key.265
> ffffffff88600d70 b dl_generation
> ffffffff88600d78 b sched_clock_irqtime
>
> What we can do is placing read-mostly `idle_cpus_mask` pointer in a new
> cacheline, so data followed by nohz would not be affected by nr_cpus.
>

That's a concern. If the layout is compiler dependent, then sometimes the
change helps and sometimes it won't.

It should be done the other way around rather than by changing nohz. If
there is a structure which sees a lot of reads/updates, it should go into
its own cacheline instead. i.e. in your case, sched_clock_irqtime should go
into its own cacheline.

---

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 4f97896887ec..29f9438f9f03 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -25,7 +25,7 @@
   */
  DEFINE_PER_CPU(struct irqtime, cpu_irqtime);
  
-int sched_clock_irqtime;
+int sched_clock_irqtime __cacheline_aligned;
  
  void enable_sched_clock_irqtime(void)
  {
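
To make the layout argument concrete, here is a minimal user-space sketch of
the arithmetic behind the /proc/kallsyms listing quoted above. It is only an
illustration under assumptions: a 64-byte cache line is assumed, the names
cacheline_of(), same_cacheline(), struct padded_int and demo_irqtime are
made up for this example and are not kernel APIs, and C11 alignas is used
as a rough stand-in for the kernel's __cacheline_aligned.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Other line sizes change the results. */
#define CACHELINE_SIZE 64u

/* Index of the cache line that contains addr. */
static inline uint64_t cacheline_of(uint64_t addr)
{
	return addr / CACHELINE_SIZE;
}

/* True when both addresses fall into the same cache line. */
static inline bool same_cacheline(uint64_t a, uint64_t b)
{
	return cacheline_of(a) == cacheline_of(b);
}

/* Addresses copied from the /proc/kallsyms listing quoted in this thread. */
const uint64_t addr_nohz = 0xffffffff88600d40ULL;
const uint64_t addr_sched_clock_irqtime = 0xffffffff88600d78ULL;

/*
 * Hypothetical user-space analogue of giving a hot variable its own cache
 * line: the member is aligned to the line size, so the struct starts on a
 * line boundary and its size is padded up to one full line.
 */
struct padded_int {
	alignas(CACHELINE_SIZE) int value;
};

static_assert(sizeof(struct padded_int) == CACHELINE_SIZE,
	      "padded_int should occupy exactly one assumed cache line");

/* Illustrative variable only; not the kernel's sched_clock_irqtime. */
struct padded_int demo_irqtime;
```

With the quoted addresses, same_cacheline(addr_nohz, addr_sched_clock_irqtime)
is true under the 64-byte assumption, i.e. nohz and sched_clock_irqtime sit
in the same line in that build. Padding the hot variable itself, as in
struct padded_int, does not depend on what the compiler happens to place
after nohz.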