From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE, SPF_PASS,UNPARSEABLE_RELAY,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D7C3CC2D0A3 for ; Mon, 9 Nov 2020 12:27:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 78024206D8 for ; Mon, 9 Nov 2020 12:27:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729523AbgKIM1Z (ORCPT ); Mon, 9 Nov 2020 07:27:25 -0500 Received: from out4436.biz.mail.alibaba.com ([47.88.44.36]:34314 "EHLO out4436.biz.mail.alibaba.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729045AbgKIM1Y (ORCPT ); Mon, 9 Nov 2020 07:27:24 -0500 X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R861e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=e01e04407;MF=baolin.wang@linux.alibaba.com;NM=1;PH=DS;RN=6;SR=0;TI=SMTPD_---0UEkOxMS_1604924830; Received: from 30.21.164.53(mailfrom:baolin.wang@linux.alibaba.com fp:SMTPD_---0UEkOxMS_1604924830) by smtp.aliyun-inc.com(127.0.0.1); Mon, 09 Nov 2020 20:27:10 +0800 Subject: Re: [PATCH] arm64: PCI: Validate the node before setting node id for root bus From: Baolin Wang To: Lorenzo Pieralisi Cc: Will Deacon , catalin.marinas@arm.com, baolin.wang7@gmail.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org References: <1600770804-116365-1-git-send-email-baolin.wang@linux.alibaba.com> <20200928140054.GA11500@willie-the-truck> <20200928144957.GA90366@VM20190228-100.tbsite.net> <20200928152326.GA15640@e121166-lin.cambridge.arm.com> <26284ca5-ea05-0496-629d-9951f49dda8f@linux.alibaba.com> <20201001085538.GA5142@e121166-lin.cambridge.arm.com> Message-ID: Date: Mon, 9 Nov 2020 20:27:09 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.4.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Lorenzo, > >> On Tue, Sep 29, 2020 at 11:41:29PM +0800, Baolin Wang wrote: >>> Hi, >>> >>> 锟斤拷 2020/9/28 23:23, Lorenzo Pieralisi 写锟斤拷: >>>> On Mon, Sep 28, 2020 at 10:49:57PM +0800, Baolin Wang wrote: >>>>> On Mon, Sep 28, 2020 at 03:00:55PM +0100, Will Deacon wrote: >>>>>> [+ Lorenzo] >>>>>> >>>>>> On Tue, Sep 22, 2020 at 06:33:24PM +0800, Baolin Wang wrote: >>>>>>> If the BIOS disabled the NUMA configuration, but did not change the >>>>>>> proximity domain description in the SRAT table, so the PCI root bus >>>>>>> device may get a incorrect node id by acpi_get_node(). >>>>>> >>>>>> How "incorrect" are we talking here? What actually goes wrong? At >>>>>> some >>>>>> point, we have to trust what the firmware is telling us. >>>>> >>>>> What I mean is, if we disable the NUMA from BIOS >>>> >>>> Please define what this means ie are you removing SRAT from ACPI static >>>> tables ? >>> >>> Yes. >>> >>>> >>>>> but we did not change the PXM for the PCI devices, >>>> >>>> If a _PXM maps to a proximity domain that is not described in the SRAT >>>> your firmware is buggy. >>> >>> Sorry for confusing, that's not what I mean. When the BIOS disable >>> the NUMA >>> (remove the SRAT table), but the PCI devices' _PXM description is still >>> available, which means we can still get the pxm from >>> acpi_evaluate_integer() >>> in this case. >> >> There should not be a _PXM object if the SRAT is not available, that's >> a firmware bug. >> >>> So we can get below inconsistent log on ARM platform: >>> "No NUMA configuration found >>> PCI_bus 0000:00 on NUMA node 0 >>> ... >>> PCI_bus 0000:e3 on NUMA node 1" >>> >>> On X86, the pci_acpi_root_get_node() will validate the node before >>> setting >>> the node id for root bus. So I think we can add this validation for ARM >>> platform. Or anything else I missed? >> >> We are not adding checks because x86 does it, it is certainly to paper >> over a firmware bug that you hopefully still have a chance to fix, >> let's do that instead of adding code that is not necessary. > > Thanks for your input, and I will check this issue with our firmware > colleagues again. Sorry for late reply. I did some investigation for this issue. I am sorry I made some misleading description in the commit message. The issue is, when we want to disable the NUMA from firmware, we usually just remove the SRAT table from the BIOS. But the devices' proximity domain information is still remain in the ACPI tables. For example, the IORT table still contains the proximity domain information for the SMMU devices, so in this case, the SMMU devices still can get incorrect NUMA nodes if we remove the SRAT table. But the SMMU devices will validate the numa node in arm_smmu_v3_set_proximity() to avoid this issue. static int __init arm_smmu_v3_set_proximity(struct device *dev, struct acpi_iort_node *node) { struct acpi_iort_smmu_v3 *smmu; smmu = (struct acpi_iort_smmu_v3 *)node->node_data; if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) { int dev_node = pxm_to_node(smmu->pxm); if (dev_node != NUMA_NO_NODE && !node_online(dev_node)) return -EINVAL; set_dev_node(dev, dev_node); pr_info("SMMU-v3[%llx] Mapped to Proximity domain %d\n", smmu->base_address, smmu->pxm); } return 0; } So similar with SMMU devices, the DSDT table will still contain the PCI root host devices' proximity domain though we already remove the SRAT table. So I think we still need this validation in pcibios_root_bridge_prepare() to avoid this issue like other devices did. I can update the commit message in next version if you think this is reasonable. Thanks.