From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 29 Aug 2023 16:30:17 +0800
From: Feng Tang <feng.tang@intel.com>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>, Vlastimil Babka, David Rientjes
Cc: "Sang, Oliver", Jay Patel, oe-lkp@lists.linux.dev, lkp,
	linux-mm@kvack.org, "Huang, Ying", "Yin, Fengwei", cl@linux.com,
	penberg@kernel.org, iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
	aneesh.kumar@linux.ibm.com, tsahu@linux.ibm.com, piyushs@linux.ibm.com
Subject: Re: [PATCH] [RFC PATCH v2]mm/slub: Optimize slub memory usage
References: <20230628095740.589893-1-jaypatel@linux.ibm.com>
	<202307172140.3b34825a-oliver.sang@intel.com>
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
List-ID: <linux-mm.kvack.org>
On Tue, Jul 25, 2023 at 05:20:01PM +0800, Tang, Feng wrote:
> On Tue, Jul 25, 2023 at 12:13:56PM +0900, Hyeonggon Yoo wrote:
> [...]
> > > I ran the reproduce command on a local 2-socket box:
> > >
> > > "/usr/bin/hackbench" "-g" "128" "-f" "20" "--process" "-l" "30000" "-s" "100"
> > >
> > > And found 2 kmem_caches had been boosted: 'kmalloc-cg-512' and
> > > 'skbuff_head_cache'. Only the order of 'kmalloc-cg-512' was reduced
> > > from 3 to 2 with the patch, while its 'cpu_partial_slabs' was bumped
> > > from 2 to 4. The settings of 'skbuff_head_cache' were kept unchanged.
> > >
> > > And this tallies with the perf-profile info from 0Day's report, in
> > > that the 'list_lock' contention is increased with the patch:
> > >
> > >   13.71% 13.70%  [kernel.kallsyms]  [k] native_queued_spin_lock_slowpath - -
> > >   5.80% native_queued_spin_lock_slowpath;_raw_spin_lock_irqsave;__unfreeze_partials;skb_release_data;consume_skb;unix_stream_read_generic;unix_stream_recvmsg;sock_recvmsg;sock_read_iter;vfs_read;ksys_read;do_syscall_64;entry_SYSCALL_64_after_hwframe;__libc_read
> > >   5.56% native_queued_spin_lock_slowpath;_raw_spin_lock_irqsave;get_partial_node.part.0;___slab_alloc.constprop.0;__kmem_cache_alloc_node;__kmalloc_node_track_caller;kmalloc_reserve;__alloc_skb;alloc_skb_with_frags;sock_alloc_send_pskb;unix_stream_sendmsg;sock_write_iter;vfs_write;ksys_write;do_syscall_64;entry_SYSCALL_64_after_hwframe;__libc_write
> >
> > Oh... so neither of the assumptions was true.
> > AFAICS it's a case where decreasing the slab order increases lock
> > contention.
> >
> > The number of cached objects per CPU is mostly the same (not exactly
> > the same, because the cpu slab is not accounted for),
>
> Yes, this makes sense!
>
> > but it only increases the number of slabs to process while taking
> > slabs (get_partial_node()) and flushing the current cpu partial list
> > (put_cpu_partial() -> __unfreeze_partials()).
> >
> > Can we do better in this situation? Improve __unfreeze_partials()?
>
> We can check that. IMHO, the current MIN_PARTIAL and MAX_PARTIAL are
> too small as global parameters, especially for server platforms with
> hundreds of GBs or TBs of memory.
>
> As for 'list_lock', I'm thinking of bumping the number of per-cpu
> objects in set_cpu_partial(), or at least giving users an option to do
> so on server platforms with huge amounts of memory. Will do some tests
> around it, and let 0Day's performance testing framework monitor for
> any regressions.
Before this performance regression of 'hackbench', I had noticed other
cases where the per-node 'list_lock' was contended. As a single
processor (socket/node) can now have more and more CPUs (100+ or 200+),
the scalability problem could get much worse. So we may need to tackle
it sooner or later, and we will likely need to separate the handling for
large platforms, which suffer from the scalability issue, from small
platforms, which care more about memory footprint.

To address the scalability issue on large systems with many CPUs and a
lot of memory, I tried 3 hacky patches for a quick measurement:

1) increase MIN_PARTIAL and MAX_PARTIAL so that each node can keep
   more (64) partial slabs at maximum
2) increase the order of each slab (including raising the max slub
   order to 4)
3) increase the number of per-cpu partial slabs

These patches are mostly independent of each other. I ran will-it-scale
benchmark's 'mmap1' test case on a 2-socket Sapphire Rapids server
(112 cores / 224 threads) with 256 GB of DRAM, in 3 configurations with
parallel test threads at 25%, 50% and 100% of the number of CPUs. The
data is (base is the vanilla v6.5 kernel):

                 base      base + patch-1    base + patch-1,2   base + patch-1,2,3
config-25%     223670    -0.0%    223641   +24.2%    277734   +37.7%    307991   per_process_ops
config-50%     186172   +12.9%    210108   +42.4%    265028   +59.8%    297495   per_process_ops
config-100%     89289   +11.3%     99363   +47.4%    131571   +78.1%    158991   per_process_ops

And from the perf-profile data, the spinlock contention has been greatly
reduced:

     43.65    -5.8    37.81   -25.9    17.78   -34.4     9.24   self.native_queued_spin_lock_slowpath

Some more perf backtrace stack changes are:

     50.86    -4.7    46.16    -9.2    41.65   -16.3    34.57   bt.mmap_region.do_mmap.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
     52.99    -4.4    48.55    -8.1    44.93   -14.6    38.35   bt.do_mmap.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
     53.79    -4.4    49.44    -7.6    46.17   -14.0    39.75   bt.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
     54.11    -4.3    49.78    -7.5    46.65   -13.8    40.33   bt.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
     54.21    -4.3    49.89    -7.4    46.81   -13.7    40.50   bt.entry_SYSCALL_64_after_hwframe.__mmap
     55.21    -4.2    51.00    -6.8    48.40   -13.0    42.23   bt.__mmap
     19.59    -4.1    15.44   -10.3     9.30   -12.6     7.00   bt.___slab_alloc.__kmem_cache_alloc_bulk.kmem_cache_alloc_bulk.mas_alloc_nodes.mas_preallocate
     20.25    -4.1    16.16    -9.8    10.40   -12.1     8.15   bt.__kmem_cache_alloc_bulk.kmem_cache_alloc_bulk.mas_alloc_nodes.mas_preallocate.mmap_region
     20.52    -4.1    16.46    -9.7    10.80   -11.9     8.60   bt.kmem_cache_alloc_bulk.mas_alloc_nodes.mas_preallocate.mmap_region.do_mmap
     21.27    -4.0    17.25    -9.4    11.87   -11.4     9.83   bt.mas_alloc_nodes.mas_preallocate.mmap_region.do_mmap.vm_mmap_pgoff
     21.34    -4.0    17.33    -9.4    11.97   -11.4     9.95   bt.mas_preallocate.mmap_region.do_mmap.vm_mmap_pgoff.do_syscall_64
      2.60    -2.6     0.00    -2.6     0.00    -2.6     0.00   bt.get_partial_node.get_any_partial.___slab_alloc.__kmem_cache_alloc_bulk.kmem_cache_alloc_bulk
      2.77    -2.4     0.35 ± 70%  -2.8     0.00    -2.8     0.00   bt.get_any_partial.___slab_alloc.__kmem_cache_alloc_bulk.kmem_cache_alloc_bulk.mas_alloc_nodes

Yu Chen also saw similar slub lock contention in a scheduler-related
'hackbench' test; with these debug patches, the contention was likewise
reduced:
https://lore.kernel.org/lkml/ZORaUsd+So+tnyMV@chenyu5-mobl2/

I'll think about how to apply the changes only to big systems, and will
post them as RFC patches.

Thanks,
Feng