Message-ID: <16f4c4312978bc1093df4cdba2f352fee33f8927.camel@linux.intel.com>
Subject: Re: [RESEND PATCH] sched/fair: Skip sched_balance_running cmpxchg when balance is not due
From: Tim Chen
To: Shrikanth Hegde, Peter Zijlstra
Cc: Ingo Molnar, Chen Yu, Doug Nelson, Mohini Narkhede,
    linux-kernel@vger.kernel.org, Vincent Guittot, K Prateek Nayak
Date: Fri, 03 Oct 2025 09:37:42 -0700
In-Reply-To: <204e1921-f3e3-41cf-bae7-36884f50503b@linux.ibm.com>
References: <204e1921-f3e3-41cf-bae7-36884f50503b@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2025-10-03 at 10:53 +0530, Shrikanth Hegde wrote:
>
> On 10/3/25 4:30 AM, Tim Chen wrote:
> > Repost comments:
> >
> > There have been past discussions about avoiding serialization in load
> > balancing, but no objections were raised to this patch itself during
> > its last posting:
> > https://lore.kernel.org/lkml/20250416035823.1846307-1-tim.c.chen@linux.intel.com/
> >
> > Vincent and Chen Yu have already provided their Reviewed-by tags.
> >
> > We recently encountered this issue again on a 2-socket, 240-core
> > Clearwater Forest server running SPECjbb. In this case, 14% of CPU
> > cycles were wasted on unnecessary acquisitions of
> > sched_balance_running. This reinforces the need for the change, and we
> > hope it can be merged.
> >
> > Tim
> >
> > ---
> >
> > During load balancing, balancing at the LLC level and above must be
> > serialized. The scheduler currently checks the atomic
> > `sched_balance_running` flag before verifying whether a balance is
> > actually due. This causes high contention, as multiple CPUs may attempt
> > to acquire the flag concurrently.
> >
> > On a 2-socket Granite Rapids system with sub-NUMA clustering enabled
> > and running OLTP workloads, 7.6% of CPU cycles were spent on cmpxchg
> > operations for `sched_balance_running`. In most cases, the attempt
> > aborts immediately after acquisition because the load balance time is
> > not yet due.
> >
> > Fix this by checking whether a balance is due *before* trying to
> > acquire `sched_balance_running`. This avoids many wasted acquisitions
> > and reduces the cmpxchg overhead in `sched_balance_domain()` from 7.6%
> > to 0.05%. As a result, OLTP throughput improves by 11%.
> >
> > Reviewed-by: Chen Yu
> > Reviewed-by: Vincent Guittot
> > Signed-off-by: Tim Chen
> > ---
>
> Hi Tim.
>
> Fine by me. unnecessary atomic operations do hurt on large systems.
> The further optimization that i pointed out can come in later i guess.
> That would help only further. this should be good to begin with.

Thanks for your review and your past comments. We'll look into further
optimization if we find that this becomes a hot path again. For now this
change seems to be good enough.

Tim

>
> With that.
> Reviewed-by: Shrikanth Hegde
>
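
For readers following the thread, the reordering described in the patch
text can be illustrated with a small userspace sketch. This is not the
kernel code: `balance_running`, `balance_due()` and `count_cmpxchg()`
are hypothetical names, C11 atomics stand in for the kernel's atomic
helpers, and the loop is single-threaded, so it only counts how many
cmpxchg operations each ordering issues rather than measuring contention.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's sched_balance_running. */
static atomic_int balance_running;

/* Wrap-safe "now >= deadline", in the spirit of the kernel's time_after_eq(). */
static bool balance_due(unsigned long now, unsigned long next_balance)
{
	return (long)(now - next_balance) >= 0;
}

/*
 * Run `ticks` balance attempts with a balance due every `interval` ticks
 * and return how many cmpxchg operations were issued.
 */
static unsigned long count_cmpxchg(bool check_due_first, unsigned long ticks,
				   unsigned long interval)
{
	unsigned long next_balance = interval, attempts = 0;

	for (unsigned long now = 0; now < ticks; now++) {
		int expected = 0;

		/* Proposed order: skip the atomic entirely when nothing is due. */
		if (check_due_first && !balance_due(now, next_balance))
			continue;

		attempts++;
		if (!atomic_compare_exchange_strong_explicit(&balance_running,
				&expected, 1,
				memory_order_acquire, memory_order_relaxed))
			continue;	/* another CPU holds the flag */

		/* The old order only learns here that the acquisition was wasted. */
		if (balance_due(now, next_balance))
			next_balance = now + interval;

		atomic_store_explicit(&balance_running, 0, memory_order_release);
	}

	/* The flag must always be released by the time the loop ends. */
	assert(atomic_load(&balance_running) == 0);
	return attempts;
}
```

With 1000 ticks and an interval of 100, `count_cmpxchg(false, 1000, 100)`
issues one cmpxchg per tick, while `count_cmpxchg(true, 1000, 100)` issues
one only on the ticks where a balance is actually due. On a real multi-CPU
system each avoided cmpxchg is also an avoided cache-line bounce, which is
where the savings reported in the patch text would come from.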