From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <632e6be8-f1e2-b57f-a70c-f3aec3adabd1@bytedance.com>
Date: Tue, 14 Feb 2023 20:33:00 +0800
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.7.2
Subject: Re: [PATCH] mm: page_alloc: don't allocate page from memoryless nodes
Content-Language: en-US
To: Mike Rapoport, David Hildenbrand
Cc: Qi Zheng, Vlastimil Babka, akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Teng Hu, Matthew Wilcox, Mel Gorman,
 Oscar Salvador, Muchun Song
References: <20230212110305.93670-1-zhengqi.arch@bytedance.com>
 <2484666e-e78e-549d-e075-b2c39d460d71@suse.cz>
 <85af4ada-96c8-1f99-90fa-9b6d63d0016e@bytedance.com>
 <67240e55-af49-f20a-2b4b-b7d574cd910d@gmail.com>
 <22f0e262-982e-ea80-e52a-a3c924b31d58@redhat.com>
 <4386151c-0328-d207-9a71-933ef61817f9@redhat.com>
From: Qi Zheng
Content-Type:
 text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 2023/2/14 19:44, Mike Rapoport wrote:
> (added x86 folks)
>
> On Tue, Feb 14, 2023 at 12:29:42PM +0100, David
> Hildenbrand wrote:
>> On 14.02.23 12:26, Qi Zheng wrote:
>>> On 2023/2/14 19:22, David Hildenbrand wrote:
>>>>
>>>> TBH, this is the first time I hear of NODE_MIN_SIZE and it seems to be a
>>>> pretty x86 specific thing.
>>>>
>>>> Are we sure we want to get NODE_MIN_SIZE involved?
>>>
>>> Maybe add an arch_xxx() to handle it?
>>
>> I still haven't figured out what we want to achieve with NODE_MIN_SIZE at
>> all. It smells like an arch-specific hack looking at
>>
>> "Don't confuse VM with a node that doesn't have the minimum amount of
>> memory"
>>
>> Why shouldn't mm-core deal with that?
>
> Well, a node with <4M RAM is not very useful and bears all the overhead of
> an extra live node.
>
> But, hey, why won't we just drop that '< NODE_MIN_SIZE' and let people with
> weird HW configurations just live with this?

Just to sum up, whether we deal with '< NODE_MIN_SIZE' or not, IIUC, the
following two should be modified:

1) we should skip memoryless nodes completely in find_next_best_node():

@@ -6382,8 +6378,11 @@ int find_next_best_node(int node, nodemask_t *used_node_mask)
 	int min_val = INT_MAX;
 	int best_node = NUMA_NO_NODE;
 
-	/* Use the local node if we haven't already */
-	if (!node_isset(node, *used_node_mask)) {
+	/*
+	 * Use the local node if we haven't already. But for memoryless local
+	 * node, we should skip it and fallback to other nodes.
+	 */
+	if (!node_isset(node, *used_node_mask) && node_state(node, N_MEMORY)) {
 		node_set(node, *used_node_mask);
 		return node;
 	}

This also fixes the bug mentioned in commit message.

2) we should call node_states_clear_node() before build_all_zonelists() in
offline_pages():

@@ -1931,12 +1931,12 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	/* reinitialise watermarks and update pcp limits */
 	init_per_zone_wmark_min();
 
+	node_states_clear_node(node, &arg);
 	if (!populated_zone(zone)) {
 		zone_pcp_reset(zone);
 		build_all_zonelists(NULL);
 	}
 
-	node_states_clear_node(node, &arg);
 	if (arg.status_change_nid >= 0) {
 		kcompactd_stop(node);
 		kswapd_stop(node);

Otherwise, the node whose N_MEMORY state is about to be cleared will still
be established in the fallback list of other nodes. Right?

Thanks,
Qi

>
>> I'd appreciate an explanation of the bigger picture, what the issue is and
>> what the approach to solve it is (including memory onlining/offlining).
>>
>> --
>> Thanks,
>>
>> David / dhildenb
>>
>

--
Thanks,
Qi