From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <0920b6d2-73e0-4245-8806-e2cf08f74603@intel.com>
Date: Wed, 8 Apr 2026 18:16:44 +0800
Subject: Re: [PATCH v2 1/4] sched/rt: Optimize cpupri_vec layout to mitigate cache line contention
To: Pan Deng
References: <24c460fb48d86a5b990acbb42d0d29d91dfc427c.1753076363.git.pan.deng@intel.com>
From: "Chen, Yu C"
In-Reply-To: <24c460fb48d86a5b990acbb42d0d29d91dfc427c.1753076363.git.pan.deng@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/21/2025 2:10 PM, Pan Deng wrote:
> When running a multi-instance FFmpeg workload on an HCC system, significant
> cache line contention is observed around `cpupri_vec->count` and `mask` in
> struct root_domain.
>
> The SUT is a 2-socket machine with 240 physical cores and 480 logical
> CPUs. 60 FFmpeg instances are launched, each pinned to 4 physical cores
> (8 logical CPUs) for transcoding tasks. Sub-threads use RT priority 99
> with FIFO scheduling. FPS is used as score.
>

[ ... ]

> As a result:
> - FPS improves by ~11%
> - Kernel cycles% drops from ~20% to ~11%
> - `count` and `mask` related cache line contention is mitigated, perf c2c
>   shows root_domain cache line 3 `cycles per load` drops from ~10K-59K
>   to ~0.5K-8K, cpupri's last cache line no longer appears in the report.
> - stress-ng cyclic benchmark is improved ~31.4%, command:
>   stress-ng/stress-ng --cyclic $(nproc) --cyclic-policy fifo \
>     --timeout 30 --minimize --metrics
> - rt-tests/pi_stress is improved ~76.5%, command:
>   rt-tests/pi_stress -D 30 -g $(($(nproc) / 2))
>

According to your test results above, this original proposal seems
simple enough. It provides a general benefit, not only for FFmpeg
workloads with "unusual" CPU affinity settings, but also for other
common workloads that do not use CPU affinity or partitioning.
I still prefer this proposal. Later we can rebase patch 4 on top of
sbm to see if it brings further improvements. patch 1 and patch 4
could form a patch series IMHO.

thanks,
Chenyu

> diff --git a/kernel/sched/cpupri.h b/kernel/sched/cpupri.h
> index d6cba0020064..245b0fa626be 100644
> --- a/kernel/sched/cpupri.h
> +++ b/kernel/sched/cpupri.h
> @@ -9,7 +9,7 @@
>
>  struct cpupri_vec {
>  	atomic_t		count;
> -	cpumask_var_t		mask;
> +	cpumask_var_t		mask	____cacheline_aligned;
>  };
>
>  struct cpupri {