From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29BA0CA9EAE for ; Thu, 31 Oct 2019 03:28:27 +0000 (UTC) Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 92D0620859 for ; Thu, 31 Oct 2019 03:28:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 92D0620859 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Received: from bilbo.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 473W4S3T07zF4Qk for ; Thu, 31 Oct 2019 14:28:24 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=huawei.com (client-ip=45.249.212.32; helo=huawei.com; envelope-from=linyunsheng@huawei.com; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=huawei.com Received: from huawei.com (szxga06-in.huawei.com [45.249.212.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 473W232QsBzDqmX for ; Thu, 31 Oct 2019 14:26:17 +1100 (AEDT) Received: from DGGEMS404-HUB.china.huawei.com (unknown [172.30.72.60]) by Forcepoint Email with ESMTP id A11698C8659FB5227AE0; Thu, 31 Oct 2019 11:26:07 +0800 (CST) Received: from [127.0.0.1] (10.74.191.121) by DGGEMS404-HUB.china.huawei.com (10.3.19.204) with Microsoft SMTP Server id 14.3.439.0; Thu, 31 Oct 2019 11:26:05 +0800 Subject: Re: [PATCH v7] numa: make node_to_cpumask_map() NUMA_NO_NODE aware To: Peter Zijlstra References: <1572428068-180880-1-git-send-email-linyunsheng@huawei.com> <20191030101449.GW4097@hirez.programming.kicks-ass.net> From: Yunsheng Lin Message-ID: Date: Thu, 31 Oct 2019 11:26:04 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: <20191030101449.GW4097@hirez.programming.kicks-ass.net> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.74.191.121] X-CFilter-Loop: Reflected X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dalias@libc.org, linux-sh@vger.kernel.org, catalin.marinas@arm.com, dave.hansen@linux.intel.com, heiko.carstens@de.ibm.com, jiaxun.yang@flygoat.com, linux-mips@vger.kernel.org, mwb@linux.vnet.ibm.com, paulus@samba.org, hpa@zytor.com, sparclinux@vger.kernel.org, chenhc@lemote.com, will@kernel.org, cai@lca.pw, linux-s390@vger.kernel.org, ysato@users.sourceforge.jp, linux-acpi@vger.kernel.org, x86@kernel.org, rppt@linux.ibm.com, borntraeger@de.ibm.com, dledford@redhat.com, mingo@redhat.com, jeffrey.t.kirsher@intel.com, jhogan@kernel.org, mattst88@gmail.com, lenb@kernel.org, len.brown@intel.com, gor@linux.ibm.com, anshuman.khandual@arm.com, gregkh@linuxfoundation.org, bp@alien8.de, luto@kernel.org, bhelgaas@google.com, tglx@linutronix.de, mhocko@kernel.org, naveen.n.rao@linux.vnet.ibm.com, linux-arm-kernel@lists.infradead.org, rth@twiddle.net, axboe@kernel.dk, linux-pci@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, rjw@rjwysocki.net, linux-kernel@vger.kernel.org, ralf@linux-mips.org, tbogendoerfer@suse.de, paul.burton@mips.com, linux-alpha@vger.kernel.org, rafael@kernel.org, ink@jurassic.park.msu.ru, akpm@linux-foundation.org, robin.murphy@arm.com, davem@davemloft.net Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" On 2019/10/30 18:14, Peter Zijlstra wrote: > On Wed, Oct 30, 2019 at 05:34:28PM +0800, Yunsheng Lin wrote: >> When passing the return value of dev_to_node() to cpumask_of_node() >> without checking if the device's node id is NUMA_NO_NODE, there is >> global-out-of-bounds detected by KASAN. >> >> From the discussion [1], NUMA_NO_NODE really means no node affinity, >> which also means all cpus should be usable. So the cpumask_of_node() >> should always return all cpus online when user passes the node id as >> NUMA_NO_NODE, just like similar semantic that page allocator handles >> NUMA_NO_NODE. >> >> But we cannot really copy the page allocator logic. Simply because the >> page allocator doesn't enforce the near node affinity. It just picks it >> up as a preferred node but then it is free to fallback to any other numa >> node. This is not the case here and node_to_cpumask_map will only restrict >> to the particular node's cpus which would have really non deterministic >> behavior depending on where the code is executed. So in fact we really >> want to return cpu_online_mask for NUMA_NO_NODE. >> >> Also there is a debugging version of node_to_cpumask_map() for x86 and >> arm64, which is only used when CONFIG_DEBUG_PER_CPU_MAPS is defined, this >> patch changes it to handle NUMA_NO_NODE as normal node_to_cpumask_map(). >> >> [1] https://lkml.org/lkml/2019/9/11/66 >> Signed-off-by: Yunsheng Lin >> Suggested-by: Michal Hocko >> Acked-by: Michal Hocko >> Acked-by: Paul Burton # MIPS bits > > Still: > > Nacked-by: Peter Zijlstra (Intel) It seems I still misunderstood your meaning by "We must not silently accept NO_NODE there" in [1]. I am not sure if there is still disagreement that the NO_NODE state for dev->numa_node should exist at all. >From the previous disscussion [2], you seem to propose to do "wild guess" or "fixup" for all devices(including virtual and physcial) with NO_NODE, which means the NO_NODE is needed anymore and should be removed when the "wild guess" or "fixup" is done. So maybe the reason for your nack here it is that there should be no other NO_NODE handling or fixing related to NO_NODE before the "wild guess" or "fixup" process is finished, so making node_to_cpumask_map() NUMA_NO_NODE aware is unnecessary. Or your reason for the nack is still specific to the pcie device without a numa node, the "wild guess" need to be done for this case before making node_to_cpumask_map() NUMA_NO_NODE? Please help to clarify the reason for nack. Or is there still some other reason for the nack I missed from the previous disscussion? Thanks. [1] https://lore.kernel.org/lkml/20191011111539.GX2311@hirez.programming.kicks-ass.net/ [2] https://lore.kernel.org/lkml/20191014094912.GY2311@hirez.programming.kicks-ass.net/ > > . >