From mboxrd@z Thu Jan 1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id 2FDDDC4332F;
 Mon, 6 Nov 2023 18:10:28 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
 by gabe.freedesktop.org (Postfix) with ESMTP id C098A10E03D;
 Mon, 6 Nov 2023 18:10:26 +0000 (UTC)
Received: from bmailout1.hostsharing.net (bmailout1.hostsharing.net [83.223.95.100])
 by gabe.freedesktop.org (Postfix) with ESMTPS id 810E110E03D;
 Mon, 6 Nov 2023 18:10:25 +0000 (UTC)
Received: from h08.hostsharing.net (h08.hostsharing.net [83.223.95.28])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
 client-signature RSA-PSS (4096 bits) client-digest SHA256)
 (Client CN "*.hostsharing.net", Issuer "RapidSSL Global TLS RSA4096 SHA256 2022 CA1" (verified OK))
 by bmailout1.hostsharing.net (Postfix) with ESMTPS id D4A49300002D5;
 Mon, 6 Nov 2023 19:10:22 +0100 (CET)
Received: by h08.hostsharing.net (Postfix, from userid 100393)
 id CCA24473B55; Mon, 6 Nov 2023 19:10:22 +0100 (CET)
Date: Mon, 6 Nov 2023 19:10:22 +0100
From: Lukas Wunner
To: Mario Limonciello
Subject: Re: [PATCH v2 8/9] PCI: Exclude PCIe ports used for tunneling in
 pcie_bandwidth_available()
Message-ID: <20231106181022.GA18564@wunner.de>
References: <20231103190758.82911-1-mario.limonciello@amd.com>
 <20231103190758.82911-9-mario.limonciello@amd.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20231103190758.82911-9-mario.limonciello@amd.com>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-BeenThere: amd-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Discussion list for AMD gfx
Cc: "open list:THUNDERBOLT DRIVER", Karol Herbst, "Rafael J . Wysocki",
 "open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS",
 "open list:DRM DRIVER FOR NVIDIA GEFORCE/QUADRO GPUS",
 "open list:X86 PLATFORM DRIVERS", Andreas Noever, Alex Deucher,
 David Airlie, Marek Behún, "open list:RADEON and AMDGPU DRM DRIVERS",
 "open list:ACPI", Danilo Krummrich, "open list:PCI SUBSYSTEM",
 Ilpo Järvinen, Manivannan Sadhasivam, Michael Jamet, Mark Gross,
 Hans de Goede, Bjorn Helgaas, Mika Westerberg, Xinhui Pan, open list,
 Daniel Vetter, Yehezkel Bernat, Pali Rohár, Christian König,
 "Maciej W . Rozycki"
Errors-To: amd-gfx-bounces@lists.freedesktop.org
Sender: "amd-gfx"

On Fri, Nov 03, 2023 at 02:07:57PM -0500, Mario Limonciello wrote:
> The USB4 spec specifies that PCIe ports that are used for tunneling
> PCIe traffic over USB4 fabric will be hardcoded to advertise 2.5GT/s and
> behave as a PCIe Gen1 device. The actual performance of these ports is
> controlled by the fabric implementation.
>
> Downstream drivers such as amdgpu which utilize pcie_bandwidth_available()
> to program the device will always find the PCIe ports used for
> tunneling as a limiting factor potentially leading to incorrect
> performance decisions.
>
> To prevent problems in downstream drivers check explicitly for ports
> being used for PCIe tunneling and skip them when looking for bandwidth
> limitations of the hierarchy. If the only device connected is a root port
> used for tunneling then report that device.

I think a better approach would be to define three new bandwidths for
Thunderbolt in enum pci_bus_speed and add appropriate descriptions in
pci_speed_string(). Those three bandwidths would be 10 GBit/s for
Thunderbolt 1, 20 GBit/s for Thunderbolt 2, 40 GBit/s for
Thunderbolt 3 and 4.
Code to determine the Thunderbolt generation from the PCI ID already
exists in tb_switch_get_generation().

This will not only address the amdgpu issue you're trying to solve,
but also emit an accurate speed from __pcie_print_link_status().

The speed you're reporting with your approach is not necessarily
accurate because the next non-tunneled device in the hierarchy might
be connected with a far higher PCIe speed than what the Thunderbolt
fabric allows.

Thanks,

Lukas
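
To make the suggestion above concrete, here is a rough standalone C sketch of how Thunderbolt speeds could sit next to PCIe speeds when looking for the slowest link in a hierarchy. It is not kernel code: every identifier prefixed with sketch_ or SKETCH_ is hypothetical, the integer generation argument is only assumed to be what a helper like tb_switch_get_generation() would report, and treating a Thunderbolt link as a flat fabric rate that ignores lane width is likewise an assumption. The only figures taken from the mail are the 10, 20 and 40 GBit/s rates.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's enum pci_bus_speed. Names and
 * values are made up for this sketch; only the three Thunderbolt rates
 * come from the mail above.
 */
enum sketch_bus_speed {
	SKETCH_PCIE_2_5GT,
	SKETCH_PCIE_8_0GT,
	SKETCH_PCIE_16_0GT,
	SKETCH_TBT_10GBIT,	/* Thunderbolt 1 */
	SKETCH_TBT_20GBIT,	/* Thunderbolt 2 */
	SKETCH_TBT_40GBIT,	/* Thunderbolt 3 and 4 */
	SKETCH_SPEED_UNKNOWN,
};

/* Hypothetical counterpart to pci_speed_string(): describe a speed. */
const char *sketch_speed_string(enum sketch_bus_speed speed)
{
	switch (speed) {
	case SKETCH_PCIE_2_5GT:		return "2.5 GT/s PCIe";
	case SKETCH_PCIE_8_0GT:		return "8.0 GT/s PCIe";
	case SKETCH_PCIE_16_0GT:	return "16.0 GT/s PCIe";
	case SKETCH_TBT_10GBIT:		return "10 GBit/s Thunderbolt";
	case SKETCH_TBT_20GBIT:		return "20 GBit/s Thunderbolt";
	case SKETCH_TBT_40GBIT:		return "40 GBit/s Thunderbolt";
	default:			return "Unknown";
	}
}

/* Map a Thunderbolt generation (assumed to be 1..4) to a speed. */
enum sketch_bus_speed sketch_tbt_speed(int generation)
{
	switch (generation) {
	case 1:		return SKETCH_TBT_10GBIT;
	case 2:		return SKETCH_TBT_20GBIT;
	case 3:
	case 4:		return SKETCH_TBT_40GBIT;
	default:	return SKETCH_SPEED_UNKNOWN;
	}
}

/* One link in the path from a device up to the root. */
struct sketch_link {
	enum sketch_bus_speed speed;
	unsigned int width;	/* lane count; ignored for Thunderbolt */
};

/*
 * Bandwidth of one link in Mb/s. PCIe entries use the per-lane rate
 * after encoding overhead times the lane count. Thunderbolt entries
 * use the flat fabric rate (an assumption of this sketch). Unknown
 * speeds yield 0.
 */
unsigned int sketch_link_mbps(const struct sketch_link *link)
{
	switch (link->speed) {
	case SKETCH_PCIE_2_5GT:		return 2000 * link->width;
	case SKETCH_PCIE_8_0GT:		return 7877 * link->width;
	case SKETCH_PCIE_16_0GT:	return 15754 * link->width;
	case SKETCH_TBT_10GBIT:		return 10000;
	case SKETCH_TBT_20GBIT:		return 20000;
	case SKETCH_TBT_40GBIT:		return 40000;
	default:			return 0;
	}
}

/* Return the bandwidth of the slowest link on the path, in Mb/s. */
unsigned int sketch_bandwidth_available(const struct sketch_link *path,
					size_t n)
{
	unsigned int min;
	size_t i;

	assert(path && n > 0);
	min = sketch_link_mbps(&path[0]);
	for (i = 1; i < n; i++) {
		unsigned int bw = sketch_link_mbps(&path[i]);

		if (bw < min)
			min = bw;
	}
	return min;
}
```

Take a path made of a tunneled port advertising 2.5 GT/s x4 followed by a non-tunneled 16 GT/s x16 link. As advertised, the sketch reports 8000 Mb/s. Skipping the tunneled port leaves only the x16 link and reports 252064 Mb/s, which is more than the fabric can carry. Describing the tunneled port as sketch_tbt_speed(3) instead reports 40000 Mb/s.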