From: Uladzislau Rezki
Date: Fri, 19 Jan 2024 11:32:02 +0100
To: Dave Chinner
Cc: Uladzislau Rezki, linux-mm@kvack.org, Andrew Morton, LKML, Baoquan He, Lorenzo Stoakes, Christoph Hellwig, Matthew Wilcox, "Liam R. Howlett", "Paul E. McKenney", Joel Fernandes, Oleksiy Avramchenko
Subject: Re: [PATCH v3 10/11] mm: vmalloc: Set nr_nodes based on CPUs in a system
References: <20240102184633.748113-1-urezki@gmail.com> <20240102184633.748113-11-urezki@gmail.com>

On Fri, Jan 19, 2024 at 08:28:05AM +1100, Dave Chinner wrote:
> On Thu, Jan 18, 2024 at 07:23:47PM +0100, Uladzislau Rezki wrote:
> > On Wed, Jan 17, 2024 at 09:06:02AM +1100, Dave Chinner wrote:
> > > On Mon, Jan 15, 2024 at 08:09:29PM +0100, Uladzislau Rezki wrote:
> > > > We can easily set nr_nodes to num_possible_cpus() and let it scale for
> > > > anyone. But before doing this, I would like to give it a try as a first
> > > > step because I have not tested it well on really big NUMA systems.
> > >
> > > I don't think you need to have large NUMA systems to test it. We
> > > have the "fakenuma" feature for a reason. Essentially, once you
> > > have enough CPU cores that catastrophic lock contention can be
> > > generated in a fast path (can take as few as 4-5 CPU cores), then
> > > you can effectively test NUMA scalability with fakenuma by creating
> > > nodes with >=8 CPUs each.
> > >
> > > This is how I've done testing of NUMA-aware algorithms (like
> > > shrinkers!) for the past decade - I haven't had direct access to a
> > > big NUMA machine since 2008, yet it's relatively trivial to test
> > > NUMA-based scalability algorithms without them these days.
> > >
> > I see your point.
> > NUMA-aware scalability requires reworking, adding an extra
> > layer that allows such scaling.
> >
> > If the socket has 256 CPUs, how do we scale VAs inside that node among
> > those CPUs?
>
> It's called "sub-numa clustering" and is a BIOS option that presents
> large core count CPU packages as multiple NUMA nodes. See:
>
> https://www.intel.com/content/www/us/en/developer/articles/technical/fourth-generation-xeon-scalable-family-overview.html
>
> Essentially, large core count CPUs are a cluster of smaller core
> groups with their own resources and memory controllers. This is how
> they are laid out either on a single die (Intel) or as a collection
> of smaller dies (AMD compute complexes) that are tied together by
> the interconnect between the LLCs and memory controllers. They only
> appear as a "unified" CPU because they are configured that way by
> the BIOS, but can also be configured to actually expose their inner
> non-uniform memory access topology for operating systems and
> application stacks that are NUMA aware (like Linux).
>
> This means a "256 core" CPU would probably present as 16 smaller 16
> core CPUs each with their own L1/2/3 caches and memory controllers.
> IOWs, a single socket appears to the kernel as a 16 node NUMA system
> with 16 cores per node. Most NUMA aware scalability algorithms will
> work just fine with this sort of setup - it's just another set of
> numbers in the NUMA distance table...
>
Thank you for your input. I will go through it to see what we can do
in terms of NUMA-awareness with thousands of CPUs in total.

Thanks!

--
Uladzislau Rezki
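
For illustration only, the "256 cores seen as 16 nodes of 16 cores" arithmetic
from the quoted text can be sketched as below. This is a rough model, not
kernel code: the helper names (snc_layout, cpu_to_node_sketch) are made up for
this example, and it assumes CPUs are numbered contiguously within each node,
which real firmware does not guarantee.

```python
def snc_layout(total_cores, cores_per_node):
    """Split one socket into equally sized NUMA nodes.

    Hypothetical model: assumes contiguous CPU numbering per node.
    Returns a dict mapping node id -> list of CPU ids.
    """
    if cores_per_node <= 0 or total_cores % cores_per_node:
        raise ValueError("total_cores must be a positive multiple of cores_per_node")
    nr_nodes = total_cores // cores_per_node
    return {
        node: list(range(node * cores_per_node, (node + 1) * cores_per_node))
        for node in range(nr_nodes)
    }


def cpu_to_node_sketch(cpu, cores_per_node):
    """Map a CPU id to its node id under the same contiguous-numbering assumption."""
    return cpu // cores_per_node


layout = snc_layout(256, 16)
print(len(layout))                  # 16 nodes
print(layout[1][:2])                # [16, 17]
print(cpu_to_node_sketch(255, 16))  # 15
```

On a real system the mapping comes from the firmware tables rather than from
integer division, so this only shows the shape of the topology being discussed.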