Date: Wed, 18 Jun 2025 18:50:34 -0700
From: Dennis Zhou
To: Hyunmin Lee
Cc: linux-mm@kvack.org, Tejun Heo, Christoph Lameter
Subject: Re: [PATCH] percpu: reduce the time complexity of mapping units and cpus
References: <20250613042138.10083-1-hyunminrlee@gmail.com>
In-Reply-To: <20250613042138.10083-1-hyunminrlee@gmail.com>

Hello,

On Fri, Jun 13, 2025 at 01:21:38PM +0900, Hyunmin Lee wrote:
> For mapping units and CPUs belonging to groups, it can be inefficient to
> iterate through all CPUs by each group to find what CPUs belong to that
> group.
>
> Since group_map already has the information on which CPUs belong to which
> group, CPUs can be directly mapped to a unit in the group to which the CPU
> belongs.
>

Idk. For any single socket machine, it's even money. For large machines
with say 4+ NUMA nodes, you're not really going to notice the
microseconds the extra for loop is going to take.

I'm neutral but inclined to prefer what's already there.

> Signed-off-by: Hyunmin Lee
> ---
>  mm/percpu.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/mm/percpu.c b/mm/percpu.c
> index b35494c8ede2..968aa0ace482 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -2906,6 +2906,13 @@ static struct pcpu_alloc_info * __init __flatten pcpu_build_alloc_info(
>  	ai->atom_size = atom_size;
>  	ai->alloc_size = alloc_size;
>
> +	for_each_possible_cpu(cpu) {
> +		group = group_map[cpu];
> +		struct pcpu_group_info *gi = &ai->groups[group];
> +
> +		gi->cpu_map[gi->nr_units++] = cpu;
> +	}
> +

Nit: the struct declaration goes at the top of the block.

	for_each_possible_cpu(cpu) {
		struct pcpu_group_info *gi;

		group = group_map[cpu];
		gi = &ai->groups[group];
		gi->cpu_map[gi->nr_units++] = cpu;
	}

Or condense it into a single line:

	for_each_possible_cpu(cpu) {
		struct pcpu_group_info *gi = &ai->groups[group_map[cpu]];

		gi->cpu_map[gi->nr_units++] = cpu;
	}

>  	for (group = 0, unit = 0; group < nr_groups; group++) {
>  		struct pcpu_group_info *gi = &ai->groups[group];
>
> @@ -2916,9 +2923,6 @@ static struct pcpu_alloc_info * __init __flatten pcpu_build_alloc_info(
>  		 */
>  		gi->base_offset = unit * ai->unit_size;
>
> -		for_each_possible_cpu(cpu)
> -			if (group_map[cpu] == group)
> -				gi->cpu_map[gi->nr_units++] = cpu;
>  		gi->nr_units = roundup(gi->nr_units, upa);
>  		unit += gi->nr_units;
>  	}
> --
> 2.43.0
>

Thanks,
Dennis
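
For anyone comparing the two loops outside the kernel, here is a rough userspace sketch of the per-group scan and the single-pass version. It is an illustration only, not mm/percpu.c: struct group_info, NR_CPUS, NR_GROUPS, map_per_group(), map_single_pass() and maps_match() are made-up stand-ins, plain for loops replace for_each_possible_cpu(), and the roundup of nr_units to upa is left out.

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS   8	/* hypothetical CPU count, for illustration */
#define NR_GROUPS 2	/* hypothetical group count, for illustration */

/* Simplified stand-in for struct pcpu_group_info; not the kernel layout. */
struct group_info {
	int nr_units;
	unsigned int cpu_map[NR_CPUS];
};

/* Existing shape: every group scans every CPU, O(groups * cpus). */
static void map_per_group(const int *group_map, struct group_info *groups)
{
	for (int group = 0; group < NR_GROUPS; group++) {
		struct group_info *gi = &groups[group];

		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			if (group_map[cpu] == group)
				gi->cpu_map[gi->nr_units++] = cpu;
	}
}

/* Proposed shape: one pass over CPUs, indexing the group directly, O(cpus). */
static void map_single_pass(const int *group_map, struct group_info *groups)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct group_info *gi;

		assert(group_map[cpu] >= 0 && group_map[cpu] < NR_GROUPS);
		gi = &groups[group_map[cpu]];
		gi->cpu_map[gi->nr_units++] = cpu;
	}
}

/* Returns 1 if both approaches build identical maps for group_map. */
static int maps_match(const int *group_map)
{
	struct group_info a[NR_GROUPS], b[NR_GROUPS];

	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));
	map_per_group(group_map, a);
	map_single_pass(group_map, b);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

Both versions walk CPUs in ascending order within a group, so the resulting cpu_map contents should be identical; the difference is only in how many times the CPU range is scanned.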