Date: Wed, 21 Apr 2021 18:10:00 -0700
From: Roman Gushchin
To: Alexey Makhalov
CC: linux-mm@kvack.org, Dennis Zhou, Tejun Heo, Christoph Lameter
Subject: Re: Percpu allocator: CPU hotplug support
References: <8E7F3D98-CB68-4418-8E0E-7287E8273DA9@vmware.com>
In-Reply-To: <8E7F3D98-CB68-4418-8E0E-7287E8273DA9@vmware.com>

On Thu, Apr 22, 2021 at 12:44:37AM +0000, Alexey Makhalov wrote:
> Current implementation of percpu allocator uses total possible number of CPUs (nr_cpu_ids) to
> get number of units to allocate per chunk. Every alloc_percpu() request of N bytes will allocate
> N*nr_cpu_ids bytes even if the number of present CPUs is much less. Percpu allocator grows by
> number of chunks keeping number of units per chunk constant. This is done in that way to
> simplify CPU hotplug/remove to have per-cpu area preallocated.
>
> Problem: This behavior can lead to inefficient memory usage for big server machines and VMs,
> where nr_cpu_ids is huge.
>
> Example from my experiment:
> 2 vCPU VM with hotplug support (up to 128):

Maybe I'm missing something, but I find the setup very strange.
Who needs a 2-CPU machine that *maybe* can be extended to a 128-CPU
machine on the fly?

> [ 0.105989] smpboot: Allowing 128 CPUs, 126 hotplug CPUs
> By creating huge amount of active or/and dying memory cgroups, I can generate active percpu
> allocations of 100 MB (per single CPU) including fragmentation overhead. But in that case total
> percpu memory consumption (reported in /proc/meminfo) will be 12.8 GB. BTW, chunks are
> filled by ~75% in my experiment, so fragmentation is not a concern.
> Out of 12.8 GB:
> - 0.2 GB are actually used by present vCPUs, and
> - 12.6 GB are "wasted"!
>
> I've seen production VMs consuming 16-20 GB of memory by Percpu. Roman reported 100 GB.

My case is completely different and has nothing to do with this problem:
the machine had a huge number of outstanding percpu allocations, caused
by another problem.

> There are solutions to reduce "wasted" memory overhead such as: disabling CPU hotplug; reducing
> number of maximum CPUs reported by hypervisor or/and firmware; using possible_cpus= kernel
> parameter. But it won't eliminate fundamental issue with "wasted" memory.
>
> Suggestion: To support percpu chunks scaling by number of units there. To allocate/deallocate new
> units for existing chunks on CPU hotplug/remove event.

I guess most users don't have this problem because the number of
possible CPUs and the actual number of CPUs are usually equal or not
that different. Someone who really depends on such a setup can try
implementing it, but I'm not sure it's trivial, or even possible, to do
without adding overhead for the majority of users.
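
For reference, the figures in the quoted report follow from plain multiplication. Below is a minimal sketch of that arithmetic, assuming (as the report describes) that every possible CPU gets a full unit for each allocation. The helper name percpu_footprint and the decimal MB unit are illustrative choices, not kernel interfaces.

```python
def percpu_footprint(per_cpu_bytes, possible_cpus, present_cpus):
    """Return (total, used, idle) bytes for a given per-CPU allocation size.

    Assumption: one unit is reserved for each *possible* CPU, while only
    the units of *present* CPUs are actually in use.
    """
    total = per_cpu_bytes * possible_cpus
    used = per_cpu_bytes * present_cpus
    return total, used, total - used


MB = 10**6  # decimal units, to match the rounded figures in the report

# 100 MB of percpu allocations per CPU, 128 possible CPUs, 2 present.
total, used, idle = percpu_footprint(100 * MB, possible_cpus=128, present_cpus=2)
print(f"total={total / 1e9:.1f} GB used={used / 1e9:.1f} GB idle={idle / 1e9:.1f} GB")
```

With the numbers from the report this gives 12.8 GB in total, 0.2 GB used by the two present CPUs, and 12.6 GB sitting idle. When possible and present CPU counts are equal, the idle part is zero.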