Date: Tue, 29 Apr 2025 08:46:02 +0300
From: Leon Romanovsky <leon@kernel.org>
To: Baolu Lu
Cc: Marek Szyprowski, Jens Axboe, Christoph Hellwig, Keith Busch,
 Jake Edge, Jonathan Corbet, Jason Gunthorpe, Zhu Yanjun, Robin Murphy,
 Joerg Roedel, Will Deacon, Sagi Grimberg, Bjorn Helgaas,
 Logan Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian,
 Alex Williamson, Jérôme Glisse, Andrew Morton,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
 iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 Niklas Schnelle, Chuck Lever, Luis Chamberlain, Matthew Wilcox,
 Dan Williams, Kanchan Joshi, Chaitanya Kulkarni
Subject: Re: [PATCH v10 05/24] dma-mapping: Provide an interface to allow allocate IOVA
Message-ID: <20250429054602.GI5848@unreal>
References: <30f0601d400711b3859deeb8fef3090f5b2020a4.1745831017.git.leon@kernel.org>
 <0086302d-1cb3-43dd-a989-e4b1995a0d22@linux.intel.com>
In-Reply-To: <0086302d-1cb3-43dd-a989-e4b1995a0d22@linux.intel.com>

On Tue, Apr 29, 2025 at 11:10:54AM +0800, Baolu Lu wrote:
> On 4/28/25 17:22, Leon Romanovsky wrote:
> > From: Leon Romanovsky
> >
> > The existing .map_page() callback provides both allocating of IOVA
>
> .map_pages()

Changed, thanks

> > and linking DMA pages. That combination works great for most of the
> > callers who use it in control paths, but is less effective in fast
> > paths where there may be multiple calls to map_page().
> >
> > These advanced callers already manage their data in some sort of
> > database and can perform IOVA allocation in advance, leaving the
> > range linkage operation in the fast path.
> >
> > Provide an interface to allocate/deallocate IOVA; the next patch
> > will link/unlink DMA ranges to that specific IOVA.
> >
> > In the new API a DMA mapping transaction is identified by a
> > struct dma_iova_state, which holds some precomputed information
> > for the transaction that does not change for each page being
> > mapped, so add a check whether IOVA can be used for the specific
> > transaction.
> >
> > The API is exported from dma-iommu as it is the only implementation
> > supported; the namespace is clearly different from the iommu_*
> > functions, which are not allowed to be used. This code layout allows
> > us to save a function call per API call used in the datapath, as
> > well as a lot of boilerplate code.
> >
> > Reviewed-by: Christoph Hellwig
> > Tested-by: Jens Axboe
> > Reviewed-by: Luis Chamberlain
> > Signed-off-by: Leon Romanovsky
> > ---
> >  drivers/iommu/dma-iommu.c   | 86 +++++++++++++++++++++++++++++++++++++
> >  include/linux/dma-mapping.h | 48 +++++++++++++++++++++
> >  2 files changed, 134 insertions(+)
> >
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 9ba8d8bc0ce9..d3211a8d755e 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -1723,6 +1723,92 @@ size_t iommu_dma_max_mapping_size(struct device *dev)
> >  	return SIZE_MAX;
> >  }
> > +/**
> > + * dma_iova_try_alloc - Try to allocate an IOVA space
> > + * @dev: Device to allocate the IOVA space for
> > + * @state: IOVA state
> > + * @phys: physical address
> > + * @size: IOVA size
> > + *
> > + * Check if @dev supports the IOVA-based DMA API, and if yes allocate IOVA space
> > + * for the given base address and size.
> > + *
> > + * Note: @phys is only used to calculate the IOVA alignment. Callers that always
> > + * do PAGE_SIZE aligned transfers can safely pass 0 here.
>
> Have you considered adding a direct alignment parameter to
> dma_iova_try_alloc()? '0' simply means the default PAGE_SIZE alignment.
>
> I'm imagining that some devices might have particular alignment needs
> for better performance, especially for ATS cache efficiency. This
> would allow those device drivers to express the requirements directly
> during iova allocation.

This is actually what is happening now; take a look at the
blk_rq_dma_map_iter_start() implementation, which uses a custom
alignment.
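To make that concrete, here is a rough sketch of the intended calling
pattern (hypothetical caller code, not from this series; dev, phys and
size are assumed to exist in the caller). The only alignment knob is
@phys itself, since the IOVA is allocated with the same offset into the
IOVA granule as the physical address:

	struct dma_iova_state state;

	/*
	 * Pass the real physical address so the returned IOVA keeps its
	 * sub-granule offset (and therefore its alignment); callers that
	 * only do PAGE_SIZE-aligned transfers can pass 0 instead.
	 */
	if (!dma_iova_try_alloc(dev, &state, phys, size)) {
		/* IOVA-based path unavailable, use the regular DMA API */
		return false;
	}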
>
> > + *
> > + * Returns %true if the IOVA-based DMA API can be used and IOVA space has been
> > + * allocated, or %false if the regular DMA API should be used.
> > + */
> > +bool dma_iova_try_alloc(struct device *dev, struct dma_iova_state *state,
> > +		phys_addr_t phys, size_t size)
> > +{
> > +	struct iommu_dma_cookie *cookie;
> > +	struct iommu_domain *domain;
> > +	struct iova_domain *iovad;
> > +	size_t iova_off;
> > +	dma_addr_t addr;
> > +
> > +	memset(state, 0, sizeof(*state));
> > +	if (!use_dma_iommu(dev))
> > +		return false;
> > +
> > +	domain = iommu_get_dma_domain(dev);
> > +	cookie = domain->iova_cookie;
> > +	iovad = &cookie->iovad;
> > +	iova_off = iova_offset(iovad, phys);
> > +
> > +	if (static_branch_unlikely(&iommu_deferred_attach_enabled) &&
> > +	    iommu_deferred_attach(dev, iommu_get_domain_for_dev(dev)))
> > +		return false;
> > +
> > +	if (WARN_ON_ONCE(!size))
> > +		return false;
> > +
> > +	/*
> > +	 * DMA_IOVA_USE_SWIOTLB is a flag which is set by dma-iommu
> > +	 * internals; make sure that the caller didn't set it and/or
> > +	 * didn't use this interface to map SIZE_MAX.
> > +	 */
> > +	if (WARN_ON_ONCE((u64)size & DMA_IOVA_USE_SWIOTLB))
>
> I'm a little concerned that device drivers might inadvertently misuse
> state->__size by forgetting about the high bit being used for
> DMA_IOVA_USE_SWIOTLB. Perhaps add a separate flag within struct
> dma_iova_state to prevent such issues?

Device drivers are not supposed to use this DMA API interface directly;
the vision is that subsystems will provide wrappers specific to them.
See the HMM and block changes as examples. The VFIO mlx5 implementation
is a temporary measure until we convert another VFIO LM driver and gain
an understanding of what type of abstraction we will need.

The high bit is used to save memory.

>
> > +		return false;
> > +
> > +	addr = iommu_dma_alloc_iova(domain,
> > +			iova_align(iovad, size + iova_off),
> > +			dma_get_mask(dev), dev);
> > +	if (!addr)
> > +		return false;
> > +
> > +	state->addr = addr + iova_off;
> > +	state->__size = size;
> > +	return true;
> > +}
> > +EXPORT_SYMBOL_GPL(dma_iova_try_alloc);
>
> Thanks,
> baolu
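P.S. To spell out the "save memory" point: struct dma_iova_state stays
at two words because the swiotlb flag borrows the top bit of __size.
Roughly, the include/linux/dma-mapping.h side of this patch looks like
the sketch below (paraphrased, not a verbatim quote of the hunk):

	struct dma_iova_state {
		dma_addr_t addr;
		u64 __size;
	};

	/* The high bit of __size records that swiotlb was used */
	#define DMA_IOVA_USE_SWIOTLB	(1ULL << 63)

	static inline size_t dma_iova_size(struct dma_iova_state *state)
	{
		/* Casting is needed for 32-bit systems */
		return (size_t)(state->__size & ~DMA_IOVA_USE_SWIOTLB);
	}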