Date: Fri, 19 Apr 2024 15:01:34 +0100
From: Jonathan Cameron
To: "Ho-Ren (Jack) Chuang"
CC: "Huang, Ying", Gregory Price, Eishan Mirakhur, Vinicius Tavares Petrucci, Ravis OpenSrc, Alistair Popple, Srinivasulu Thanneeru, SeongJae Park, Dan Williams, Vishal Verma, "Dave Jiang", Andrew Morton, Hao Xiang
Subject: Re: [PATCH v11 2/2] memory tier: create CPUless memory tiers after obtaining HMAT info
Message-ID: <20240419150134.000032ff@Huawei.com>
In-Reply-To: <20240405000707.2670063-3-horenchuang@bytedance.com>
References: <20240405000707.2670063-1-horenchuang@bytedance.com> <20240405000707.2670063-3-horenchuang@bytedance.com>
Organization: Huawei Technologies Research and Development (UK) Ltd.

On Fri, 5 Apr 2024 00:07:06 +0000
"Ho-Ren (Jack) Chuang" wrote:

> The current implementation treats emulated memory devices, such as
> CXL1.1 type3 memory, as normal DRAM when they are emulated as normal memory
> (E820_TYPE_RAM).
> However, these emulated devices have different
> characteristics than traditional DRAM, making it important to
> distinguish them. Thus, we modify the tiered memory initialization process
> to introduce a delay specifically for CPUless NUMA nodes. This delay
> ensures that the memory tier initialization for these nodes is deferred
> until HMAT information is obtained during the boot process. Finally,
> demotion tables are recalculated at the end.
>
> * late_initcall(memory_tier_late_init);
> Some device drivers may have initialized memory tiers between
> `memory_tier_init()` and `memory_tier_late_init()`, potentially bringing
> online memory nodes and configuring memory tiers. They should be excluded
> in the late init.
>
> * Handle cases where there is no HMAT when creating memory tiers
> There is a scenario where a CPUless node does not provide HMAT information.
> If no HMAT is specified, it falls back to using the default DRAM tier.
>
> * Introduce another new lock `default_dram_perf_lock` for adist calculation
> In the current implementation, iterating through CPUlist nodes requires
> holding the `memory_tier_lock`. However, `mt_calc_adistance()` will end up
> trying to acquire the same lock, leading to a potential deadlock.
> Therefore, we propose introducing a standalone `default_dram_perf_lock` to
> protect `default_dram_perf_*`. This approach not only avoids deadlock
> but also prevents holding a large lock simultaneously.
>
> * Upgrade `set_node_memory_tier` to support additional cases, including
> default DRAM, late CPUless, and hot-plugged initializations.
> To cover hot-plugged memory nodes, `mt_calc_adistance()` and
> `mt_find_alloc_memory_type()` are moved into `set_node_memory_tier()` to
> handle cases where memtype is not initialized and where HMAT information is
> available.
>
> * Introduce `default_memory_types` for those memory types that are not
> initialized by device drivers.
> Because late initialized memory and default DRAM memory need to be managed,
> a default memory type is created for storing all memory types that are
> not initialized by device drivers and as a fallback.
>
> Signed-off-by: Ho-Ren (Jack) Chuang
> Signed-off-by: Hao Xiang
> Reviewed-by: "Huang, Ying"

Reviewed-by: Jonathan Cameron
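
For anyone reading the thread later, the late-init behaviour described in the
quoted commit message (skip nodes that are already configured, use HMAT data
when present, otherwise fall back to the default DRAM tier) can be modelled
roughly as below. This is only a userspace sketch, not the patch's code: the
struct, its field names, `sketch_late_init()` and the `SKETCH_ADISTANCE_DRAM`
value are all hypothetical stand-ins chosen for illustration, and locking is
left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative value only; the kernel defines its own DRAM abstract distance. */
#define SKETCH_ADISTANCE_DRAM 576

/* Hypothetical per-node state, not the kernel's data structures. */
struct sketch_node {
    bool has_cpu;        /* node has CPUs, so it was handled in early init */
    bool tier_assigned;  /* a tier was already set (early init or a driver) */
    bool has_hmat_perf;  /* firmware supplied HMAT performance data */
    int  hmat_adistance; /* abstract distance derived from HMAT, if any */
    int  adistance;      /* resulting abstract distance for the node */
};

/*
 * Late pass: only touch CPUless nodes that nobody has configured yet.
 * Returns the number of nodes configured by this pass.
 */
static size_t sketch_late_init(struct sketch_node *nodes, size_t n)
{
    size_t configured = 0;

    for (size_t i = 0; i < n; i++) {
        struct sketch_node *node = &nodes[i];

        /* Already handled in early init or by a device driver: exclude. */
        if (node->has_cpu || node->tier_assigned)
            continue;

        /* Prefer the HMAT-derived distance, else fall back to DRAM. */
        node->adistance = node->has_hmat_perf ? node->hmat_adistance
                                              : SKETCH_ADISTANCE_DRAM;
        node->tier_assigned = true;
        configured++;
    }
    return configured;
}
```

In this model a CPUless node without HMAT data ends up with the same abstract
distance as DRAM, which corresponds to the "falls back to using the default
DRAM tier" case, while a node that a driver configured between the two init
stages is left untouched.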