From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8010BC432C0 for ; Thu, 21 Nov 2019 20:00:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 552ED2068D for ; Thu, 21 Nov 2019 20:00:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726937AbfKUUAT (ORCPT ); Thu, 21 Nov 2019 15:00:19 -0500 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:13828 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726546AbfKUUAS (ORCPT ); Thu, 21 Nov 2019 15:00:18 -0500 Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id xALJwqVX078608; Thu, 21 Nov 2019 15:00:16 -0500 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 2wdfxukkbd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Nov 2019 15:00:15 -0500 Received: from m0098417.ppops.net (m0098417.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id xALK0F0j082864; Thu, 21 Nov 2019 15:00:15 -0500 Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com [169.63.214.131]) by mx0a-001b2d01.pphosted.com with ESMTP id 2wdfxukka3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Nov 2019 15:00:15 -0500 Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1]) by ppma01dal.us.ibm.com (8.16.0.27/8.16.0.27) with SMTP id xALJtswm029687; Thu, 21 Nov 2019 20:00:14 GMT Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by ppma01dal.us.ibm.com with ESMTP id 2wa8r7mxu2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 21 Nov 2019 20:00:14 +0000 Received: from b03ledav001.gho.boulder.ibm.com (b03ledav001.gho.boulder.ibm.com [9.17.130.232]) by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id xALK0CoJ37224862 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 Nov 2019 20:00:12 GMT Received: from b03ledav001.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 80D9A6E059; Thu, 21 Nov 2019 20:00:12 +0000 (GMT) Received: from b03ledav001.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2747A6E053; Thu, 21 Nov 2019 20:00:12 +0000 (GMT) Received: from rascal.austin.ibm.com (unknown [9.41.179.32]) by b03ledav001.gho.boulder.ibm.com (Postfix) with ESMTP; Thu, 21 Nov 2019 20:00:11 +0000 (GMT) From: Scott Cheloha To: linux-kernel@vger.kernel.org, "Rafael J. Wysocki" , Greg Kroah-Hartman , David Hildenbrand Cc: nathanl@linux.ibm.com, ricklind@linux.vnet.ibm.com, Scott Cheloha Subject: [PATCH v2] drivers/base/memory.c: cache blocks in radix tree to accelerate lookup Date: Thu, 21 Nov 2019 13:59:52 -0600 Message-Id: <20191121195952.3728-1-cheloha@linux.vnet.ibm.com> X-Mailer: git-send-email 2.24.0.rc1 In-Reply-To: <20191120192536.1980-1-cheloha@linux.vnet.ibm.com> References: <20191120192536.1980-1-cheloha@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.95,18.0.572 definitions=2019-11-21_05:2019-11-21,2019-11-21 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 phishscore=0 impostorscore=0 malwarescore=0 bulkscore=0 mlxlogscore=999 clxscore=1015 spamscore=0 mlxscore=0 suspectscore=0 priorityscore=1501 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-1910280000 definitions=main-1911210164 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Searching for a particular memory block by id is slow because each block device is kept in an unsorted linked list on the subsystem bus. Lookup is much faster if we cache the blocks in a radix tree. Memory subsystem initialization and hotplug/hotunplug is at least a little faster for any machine with more than ~100 blocks, and the speedup grows with the block count. Signed-off-by: Scott Cheloha Acked-by: David Hildenbrand --- v2 incorporates suggestions from David Hildenbrand. Removing the get_device/put_device dance from walk_memory_blocks() can be done in a subsequent patch. drivers/base/memory.c | 49 ++++++++++++++++++++++++++----------------- 1 file changed, 30 insertions(+), 19 deletions(-) diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 84c4e1f72cbd..32e74fc1b0ac 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -59,6 +60,13 @@ static struct bus_type memory_subsys = { .offline = memory_subsys_offline, }; +/* + * Memory blocks are cached in a local radix tree to avoid + * a costly linear search for the corresponding device on + * the subsystem bus. + */ +static RADIX_TREE(memory_blocks, GFP_KERNEL); + static BLOCKING_NOTIFIER_HEAD(memory_chain); int register_memory_notifier(struct notifier_block *nb) @@ -580,20 +588,14 @@ int __weak arch_get_memory_phys_device(unsigned long start_pfn) /* A reference for the returned memory block device is acquired. */ static struct memory_block *find_memory_block_by_id(unsigned long block_id) { - struct device *dev; + struct memory_block *mem; - dev = subsys_find_device_by_id(&memory_subsys, block_id, NULL); - return dev ? to_memory_block(dev) : NULL; + mem = radix_tree_lookup(&memory_blocks, block_id); + if (mem) + get_device(&mem->dev); + return mem; } -/* - * For now, we have a linear search to go find the appropriate - * memory_block corresponding to a particular phys_index. If - * this gets to be a real problem, we can always use a radix - * tree or something here. - * - * This could be made generic for all device subsystems. - */ struct memory_block *find_memory_block(struct mem_section *section) { unsigned long block_id = base_memory_block_id(__section_nr(section)); @@ -636,9 +638,15 @@ int register_memory(struct memory_block *memory) memory->dev.offline = memory->state == MEM_OFFLINE; ret = device_register(&memory->dev); - if (ret) + if (ret) { put_device(&memory->dev); - + return ret; + } + ret = radix_tree_insert(&memory_blocks, memory->dev.id, memory); + if (ret) { + put_device(&memory->dev); + device_unregister(&memory->dev); + } return ret; } @@ -696,6 +704,8 @@ static void unregister_memory(struct memory_block *memory) if (WARN_ON_ONCE(memory->dev.bus != &memory_subsys)) return; + WARN_ON(radix_tree_delete(&memory_blocks, memory->dev.id) == NULL); + /* drop the ref. we got via find_memory_block() */ put_device(&memory->dev); device_unregister(&memory->dev); @@ -851,20 +861,21 @@ void __init memory_dev_init(void) int walk_memory_blocks(unsigned long start, unsigned long size, void *arg, walk_memory_blocks_func_t func) { + struct radix_tree_iter iter; const unsigned long start_block_id = phys_to_block_id(start); const unsigned long end_block_id = phys_to_block_id(start + size - 1); struct memory_block *mem; - unsigned long block_id; + void **slot; int ret = 0; if (!size) return 0; - for (block_id = start_block_id; block_id <= end_block_id; block_id++) { - mem = find_memory_block_by_id(block_id); - if (!mem) - continue; - + radix_tree_for_each_slot(slot, &memory_blocks, &iter, start_block_id) { + if (iter.index > end_block_id) + break; + mem = radix_tree_deref_slot(slot); + get_device(&mem->dev); ret = func(mem, arg); put_device(&mem->dev); if (ret) -- 2.24.0.rc1