From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <23a1e55f-bf03-4c05-a954-28326b59b06b@intel.com>
Date: Fri, 19 Dec 2025 21:56:18 +0800
Subject: Re: [PATCH] sched/fair: Avoid false sharing in nohz struct
To: Shubhang Kaushik Prasanna Kumar
Cc: benjamin.lei@intel.com, bsegall@google.com, dietmar.eggemann@arm.com,
 juri.lelli@redhat.com, linux-kernel@vger.kernel.org, mgorman@suse.de,
 mingo@redhat.com, peterz@infradead.org, rostedt@goodmis.org,
 tianyou.li@intel.com, tim.c.chen@linux.intel.com,
 vincent.guittot@linaro.org, vschneid@redhat.com
From: "Guo, Wangyang"

On 12/19/2025 9:38 AM, Shubhang Kaushik Prasanna Kumar wrote:
> While the
> intuition behind isolating the `nr_cpus` counter seems correct, could
> you please justify the added padding? As this is a high-contention
> path in the scheduler, we shouldn't be inflating global structures
> with padding on logic alone. I'd like to see some benchmarking, such
> as `perf c2c` results from a multi-core system, proving the false
> sharing scenario is a measurable bottleneck.

Here is the cache line view in the c2c report:

   Num  RmtHitm  LclHitm  Offset  records  Symbol
 6.25%    0.00%    0.00%  0x0           4  [k] _nohz_idle_balance.isra.0
18.75%  100.00%    0.00%  0x8          14  [k] nohz_balance_exit_idle
 6.25%    0.00%    0.00%  0x8           8  [k] nohz_balance_enter_idle
 6.25%    0.00%    0.00%  0xc           8  [k] sched_balance_newidle
 6.25%    0.00%    0.00%  0x10         31  [k] nohz_balancer_kick
 6.25%    0.00%    0.00%  0x20         16  [k] sched_balance_newidle
37.50%    0.00%    0.00%  0x38         50  [k] irqtime_account_irq
 6.25%    0.00%    0.00%  0x38         47  [k] account_process_tick
 6.25%    0.00%    0.00%  0x38         12  [k] account_idle_ticks

Offsets:
* 0x0  -- nohz.idle_cpus_mask (r)
* 0x8  -- nohz.nr_cpus (w)
* 0x38 -- sched_clock_irqtime (r); not part of nohz, but it shares the
  cache line

The layout in /proc/kallsyms also confirms this:

ffffffff88600d40 b nohz
ffffffff88600d68 B arch_needs_tick_broadcast
ffffffff88600d6c b __key.264
ffffffff88600d6c b __key.265
ffffffff88600d70 b dl_generation
ffffffff88600d78 b sched_clock_irqtime

In the perf cycle hotspots, we noticed that irqtime_account_irq has a 3%
overhead caused by false sharing. With the patch applied, that hotspot
disappears.

> I am also concerned about the internal layout. By sandwiching the
> timer fields between two `__cacheline_aligned` boundaries, we might
> just be shifting the contention rather than fixing it. See to it that
> fields like `next_balance` aren't being squeezed into a new conflict
> zone. I would like to review the benchmark data and the struct layout
> before we move forward.

Before the patch, all the nohz members stay in the same cache line.
After the layout fix, fields like `next_balance` still sit in the same
cache line, so no new conflict zone is created. Since nohz is global,
the size inflation is minimal: less than 64 bytes in total.

BR
Wangyang