Date: Tue, 2 Jul 2024 18:41:53 +0100
From: Will Deacon
To: Nicolin Chen
Cc: robin.murphy@arm.com, joro@8bytes.org, jgg@nvidia.com,
    thierry.reding@gmail.com, vdumpa@nvidia.com, jonathanh@nvidia.com,
    linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
    linux-arm-kernel@lists.infradead.org, linux-tegra@vger.kernel.org
Subject: Re: [PATCH v9 5/6] iommu/arm-smmu-v3: Add in-kernel support for NVIDIA Tegra241 (Grace) CMDQV
Message-ID: <20240702174152.GA4740@willie-the-truck>

On Wed, Jun 12, 2024 at 02:45:32PM -0700, Nicolin Chen wrote:
> From: Nate Watterson
>
> NVIDIA's Tegra241 Soc has a CMDQ-Virtualization (CMDQV) hardware, extending
> the standard ARM SMMU v3 IP to support multiple VCMDQs with virtualization
> capabilities.
> In terms of command queue, they are very like a standard SMMU
> CMDQ (or ECMDQs), but only support CS_NONE in the CS field of CMD_SYNC.
>
> Add a new tegra241-cmdqv driver, and insert its structure pointer into the
> existing arm_smmu_device, and then add related function calls in the SMMUv3
> driver to interact with the CMDQV driver.
>
> In the CMDQV driver, add a minimal part for the in-kernel support: reserve
> VINTF0 for in-kernel use, and assign some of the VCMDQs to the VINTF0, and
> select one VCMDQ based on the current CPU ID to execute supported commands.
> This multi-queue design for in-kernel use gives some limited improvements:
> up to 20% reduction of invalidation time was measured by a multi-threaded
> DMA unmap benchmark, compared to a single queue.
>
> The other part of the CMDQV driver will be user-space support that gives a
> hypervisor running on the host OS to talk to the driver for virtualization
> use cases, allowing VMs to use VCMDQs without trappings, i.e. no VM Exits.
> This is designed based on IOMMUFD, and its RFC series is also under review.
> It will provide a guest OS a bigger improvement: 70% to 90% reductions of
> TLB invalidation time were measured by DMA unmap tests running in a guest,
> compared to nested SMMU CMDQ (with trappings).
>
> However, it is very important for this in-kernel support to get merged and
> installed to VMs running on Grace-powered servers as soon as possible. So,
> later those servers would only need to upgrade their host kernels for the
> user-space support.

^^^ This is a weird paragraph to put in the commit message.

>
> As the initial version, the CMDQV driver only supports ACPI configurations.
>
> Signed-off-by: Nate Watterson
> Reviewed-by: Jason Gunthorpe
> Co-developed-by: Nicolin Chen
> Signed-off-by: Nicolin Chen
> ---
>  MAINTAINERS                                   |   1 +
>  drivers/iommu/Kconfig                         |  11 +
>  drivers/iommu/arm/arm-smmu-v3/Makefile        |   1 +
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  52 +-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  50 ++
>  .../iommu/arm/arm-smmu-v3/tegra241-cmdqv.c    | 842 ++++++++++++++++++
>  6 files changed, 945 insertions(+), 12 deletions(-)
>  create mode 100644 drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index aacccb376c28..ecf7af1b2df8 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -22078,6 +22078,7 @@ M:	Thierry Reding
>  R:	Krishna Reddy
>  L:	linux-tegra@vger.kernel.org
>  S:	Supported
> +F:	drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
>  F:	drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
>  F:	drivers/iommu/tegra*
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index c04584be3089..e009387d3cba 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -423,6 +423,17 @@ config ARM_SMMU_V3_KUNIT_TEST
>  	  Enable this option to unit-test arm-smmu-v3 driver functions.
>
>  	  If unsure, say N.
> +
> +config TEGRA241_CMDQV
> +	bool "NVIDIA Tegra241 CMDQ-V extension support for ARM SMMUv3"
> +	depends on ACPI
> +	help
> +	  Support for NVIDIA CMDQ-Virtualization extension for ARM SMMUv3. The
> +	  CMDQ-V extension is similar to v3.3 ECMDQ for multi command queues
> +	  support, except with virtualization capabilities.
> +
> +	  Say Y here if your system is NVIDIA Tegra241 (Grace) or it has the same
> +	  CMDQ-V extension.
>  endif
>
>  config S390_IOMMU
> diff --git a/drivers/iommu/arm/arm-smmu-v3/Makefile b/drivers/iommu/arm/arm-smmu-v3/Makefile
> index 014a997753a8..55201fdd7007 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/Makefile
> +++ b/drivers/iommu/arm/arm-smmu-v3/Makefile
> @@ -2,6 +2,7 @@
>  obj-$(CONFIG_ARM_SMMU_V3) += arm_smmu_v3.o
>  arm_smmu_v3-objs-y += arm-smmu-v3.o
>  arm_smmu_v3-objs-$(CONFIG_ARM_SMMU_V3_SVA) += arm-smmu-v3-sva.o
> +arm_smmu_v3-objs-$(CONFIG_TEGRA241_CMDQV) += tegra241-cmdqv.o
>  arm_smmu_v3-objs := $(arm_smmu_v3-objs-y)
>
>  obj-$(CONFIG_ARM_SMMU_V3_KUNIT_TEST) += arm-smmu-v3-test.o
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index ba0e24d5ffbf..430e84fe3679 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -334,6 +334,9 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
>
>  static struct arm_smmu_cmdq *arm_smmu_get_cmdq(struct arm_smmu_device *smmu)
>  {
> +	if (arm_smmu_has_tegra241_cmdqv(smmu))
> +		return tegra241_cmdqv_get_cmdq(smmu);
> +
>  	return &smmu->cmdq;

Hardcoding all these tegra-specific checks in the core driver is pretty
horrible :/

Instead, please can we do something similar to the SMMUv2 driver? That is,
tweak the probe routine to call something akin to the arm_smmu_impl_init()
function, which looks at the 'model' field pulled out of the IORT and can
then dispatch directly to a tegra-specific init function (see, e.g.
nvidia_smmu_impl_init() for SMMUv2).

From there, you can both install function pointers into the
'arm_smmu_device' structure, which can be used instead of having the
'if (tegra)' checks in the main driver, and you can also re-allocate the
structure to live inside a private structure instead of having the
backpointer. Maybe those cmdq function pointers would be useful for other
extensions too (e.g. the ECMDQ stuff at [1]).
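
A rough, userspace-only sketch of that shape follows, loosely modelled on
the SMMUv2 pattern mentioned above. Every name in it (smmu_device,
smmu_impl_ops, smmu_impl_init(), tegra241_impl_init(), the MODEL_* values)
is hypothetical and chosen for illustration; none of it is the existing
SMMUv3 driver API, and malloc()/free() merely stand in for whatever
allocation scheme the real driver would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct smmu_cmdq { const char *name; };
struct smmu_device;

/* Hypothetical hook table, installed once at probe time. */
struct smmu_impl_ops {
	struct smmu_cmdq *(*get_cmdq)(struct smmu_device *smmu);
};

enum { MODEL_GENERIC = 0, MODEL_TEGRA241 = 1 };	/* made-up model values */

struct smmu_device {
	int model;				/* stand-in for the IORT 'model' field */
	struct smmu_cmdq cmdq;			/* standard command queue */
	const struct smmu_impl_ops *impl;	/* NULL for a plain SMMUv3 */
};

/* Private structure embedding the core device, so no backpointer is needed. */
struct tegra241_cmdqv {
	struct smmu_device smmu;
	struct smmu_cmdq vcmdq;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct smmu_cmdq *tegra241_get_cmdq(struct smmu_device *smmu)
{
	struct tegra241_cmdqv *cmdqv;

	assert(smmu->model == MODEL_TEGRA241);
	cmdqv = container_of(smmu, struct tegra241_cmdqv, smmu);
	return &cmdqv->vcmdq;
}

static const struct smmu_impl_ops tegra241_ops = {
	.get_cmdq = tegra241_get_cmdq,
};

/* Re-allocate the device inside the private structure and install the hooks. */
static struct smmu_device *tegra241_impl_init(struct smmu_device *smmu)
{
	struct tegra241_cmdqv *cmdqv = calloc(1, sizeof(*cmdqv));

	if (!cmdqv) {
		free(smmu);
		return NULL;
	}
	cmdqv->smmu = *smmu;
	cmdqv->smmu.impl = &tegra241_ops;
	cmdqv->vcmdq.name = "vcmdq";
	free(smmu);
	return &cmdqv->smmu;
}

/* Called once from probe: dispatch on the model, not in the hot paths. */
static struct smmu_device *smmu_impl_init(struct smmu_device *smmu)
{
	switch (smmu->model) {
	case MODEL_TEGRA241:
		return tegra241_impl_init(smmu);
	default:
		return smmu;
	}
}

struct smmu_device *smmu_probe(int model)
{
	struct smmu_device *smmu = calloc(1, sizeof(*smmu));

	if (!smmu)
		return NULL;
	smmu->model = model;
	smmu->cmdq.name = "cmdq";
	return smmu_impl_init(smmu);
}

/* Core path: no 'if (tegra)' check, only an optional hook. */
struct smmu_cmdq *smmu_get_cmdq(struct smmu_device *smmu)
{
	if (smmu->impl && smmu->impl->get_cmdq)
		return smmu->impl->get_cmdq(smmu);
	return &smmu->cmdq;
}
```

With this layout, smmu_get_cmdq() on a device probed as MODEL_GENERIC returns
the embedded cmdq, while a device probed as MODEL_TEGRA241 returns the vcmdq
from the enclosing private structure, and another extension could supply its
own ops table without touching the core path.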

Will

[1] https://lore.kernel.org/r/20240425144152.52352-3-tanmay@marvell.com