From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24D59C3271F for ; Fri, 5 Jul 2024 06:39:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 243EC6B0096; Fri, 5 Jul 2024 02:39:18 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1F2176B0098; Fri, 5 Jul 2024 02:39:18 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0E1BA6B0099; Fri, 5 Jul 2024 02:39:18 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id E598B6B0096 for ; Fri, 5 Jul 2024 02:39:17 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 90310161700 for ; Fri, 5 Jul 2024 06:39:17 +0000 (UTC) X-FDA: 82304747154.02.FB1DFCC Received: from verein.lst.de (verein.lst.de [213.95.11.211]) by imf09.hostedemail.com (Postfix) with ESMTP id 9D349140006 for ; Fri, 5 Jul 2024 06:39:15 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=none; spf=pass (imf09.hostedemail.com: domain of hch@lst.de designates 213.95.11.211 as permitted sender) smtp.mailfrom=hch@lst.de; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1720161530; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6dyuM1DptFCXj7X/3hii90axnhHw36YTzdof4v9W0uI=; b=2HSd74mCa0Sx5EjDmKq6xcJ7iT760aOJ/1+xBxKCQTQ5rA3m3RiZijDINutXBeIWP3wDQF wC9wlsl9KthtFSykYDcat6mdH83iH5PCSdi6NyQfXKT+y6RdLJqZcMH2ckD1NgWhwycrPZ QFviSZY9/DznPZH0EtnnuH0t5HbTYxQ= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1720161530; a=rsa-sha256; cv=none; b=yut1BSnLYjCDXHmorhdPPYmrt9R86ALSBOaLHyGKXzw//IvwxUBxpsIV8W4fcPM9tOFSPe 2ceoLxb68qbkUctLYl8N2BX7Hmm8fdUJjkZnW1LISjINJ+y57E2cI8FxSPFVOhcF4vnzr4 NC/Vzfrgwxa+BQeFIAoN5KOlpNSQbic= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=none; spf=pass (imf09.hostedemail.com: domain of hch@lst.de designates 213.95.11.211 as permitted sender) smtp.mailfrom=hch@lst.de; dmarc=none Received: by verein.lst.de (Postfix, from userid 2407) id B3F8468AA6; Fri, 5 Jul 2024 08:39:10 +0200 (CEST) Date: Fri, 5 Jul 2024 08:39:10 +0200 From: Christoph Hellwig To: Leon Romanovsky Cc: Jens Axboe , Jason Gunthorpe , Robin Murphy , Joerg Roedel , Will Deacon , Keith Busch , Christoph Hellwig , "Zeng, Oak" , Chaitanya Kulkarni , Sagi Grimberg , Bjorn Helgaas , Logan Gunthorpe , Yishai Hadas , Shameer Kolothum , Kevin Tian , Alex Williamson , Marek Szyprowski , =?iso-8859-1?B?Suly9G1l?= Glisse , Andrew Morton , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org, iommu@lists.linux.dev, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org Subject: Re: [RFC PATCH v1 00/18] Provide a new two step DMA API mapping API Message-ID: <20240705063910.GA12337@lst.de> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.17 (2007-11-01) X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 9D349140006 X-Stat-Signature: a1m4fjbu7xccnyq5iysyhdjjh4tfwhrz X-HE-Tag: 1720161555-900484 X-HE-Meta: U2FsdGVkX1/0Y+Ed245lnEv4F2KPV+EJt38IAYPRdPVkX/EkmwyjK+tSRi8eWQT08WEE9OnnIbe+f68FHWCYPubEIY5KqLk9nPL6hJok+medoOhq4irySTCXjS28IQN7IEa8GDvsXWbEwOEkNOKumhYcccCZe1vQL/tgaDvgv1tL9I3wTmHmcRcKq+q8OXJwTIeHsxKNhRvPbwvqr9vFtdUuxVU49ob8G7Z1HujfsLuLUuNB4ZJXpHMP1WiBtQ33CPLSaFapzXDY5VRISfOPqff2NHTiadHb3arMU6I3DGtMIOEfBIALaODkewq2x7vPhSxfID8Ym0efajTTiDEo2rglC6zFefZ2+jz8R0Efgp5ClUxA2zCacNVQ7ZXLA7Nxwj/PymT1VF064oPn7GvvcHP2FWA/z5yvrme6hsxXYC8fm0Ofox6tUjlyeiZRXr5Z2TRbcQWizDAyPuOKKnHE+8b8Oi6LIj5E5QMx6Fj3t/I3bYVZ4+FttJYzYkD8FHkdoQ0+MrIx3pChu5Fv4wRHGxuHpjnUyYkJbOINmHFTDpvb1oT9cTVDZimfOI1Uz1bgPc27XHhJ2Cx1IlRbab0d1GvR1UDt9ce97ApQ0wKS6Bg0D5yTZalKuGuSfVRjlhX7AZFIDe8NkG3EJ7ap4Obky1hMB86Xnz2c1ggumzFZa8RsMdfk8ilV84u09mQKrRnyb58x/SOcrnJS0QhpudUbRESMBaS3qmxnVLYpzRPVMWqjM56TR8yBkGWUBklT8XblDmkO4xrr75In+cLHDEGVxl2rORXwJCmj2PFHr0xzUkc0xl+iXXhidVmRg9yVTO4aMlARSdfRZoBNVqzL6Wu2HWcI8AAjCsPbO1TCGB/ViRWZ0l30b5t4LsU/C5bH5Vnwdn4Kd8ixCG4YU0ltE3W+dd+eqGQE6X9ZtCdxO3zkevR2PcUE1oa6XVYHRgI+50SS344jqCGTuBpFjpz/pyv 2JxD5f6a krF970UAVPgBJrPvj9gUhD+mzB2srqzo1adVSwAuA4+5hkk/nTtsuDIxAdNK21x2slBwA0YZJh+QqTw4= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Review from the NVMe driver consumer perspective. I think if all these were implement we'd probably end up with less code than before the conversion. The split between dma_iova_attrs, dma_memory_type and dma_iova_state is odd. I would have expected them to just be just a single object. While talking about this I think the domain field in dma_iova_state should probably be a private pointer instead of being tied to the iommu. Also do we need the attrs member in the iova_attrs structure? The "attrs" really are flags passed to the mapping routines that are per-operation and not persistent, so I'd expect them to be passed per-call and not stored in a structure. I'd also expect that the use_iova field to be in the mapping state and not separately provided by the driver. For nvme specific data structures I would have expected a dma_add/ len pair in struct iod_dma_map, maybe even using a common type. Also the data structure split seems odd - I'd expect the actual mapping state and a small number (at least one) dma_addr/len pair to be inside the nvme_iod structure, and then only do the dynamic allocation if we need more of them because there are more segments and we are not using the iommu. If we had a common data structure for the dma_addr/len pairs dma_unlink_range could just take care of the unmap for the non-iommu case as well, which would be neat. I'd also expect that dma_free_iova would be covered by it. I would have expected dma_link_range to return the dma_addr_t instead of poking into the iova structure in the callers. In __nvme_rq_dma_map the <= PAGE_SIZE case is pointless. In the existing code the reason for it is to avoid allocating and mapping the sg_table, but that code is still left before we even get to this code. My suggestion above to only allocate the dma_addr/len pairs when there is more than 1 or a few of it would allow to trivially implement that suggestion using the normal API without having to keep that special case and the dma_len parameter around. If this addes a version of dma_map_page_atttrs that directly took the physical address as a prep patch the callers would not have to bother with page pointer manipulations and just work on physical addresses for both the iommu and no-iommu cases. It would also help a little bit with the eventualy switch to store the physical address instead of page+offset in the bio_vec. Talking about that, I've been wanting to add a bvec_phys helper for to convert the page_phys(bv.bv_page) + bv.bv_offset calculations. This is becoming more urgent with more callers needing to that, I'll try to get it out to Jens ASAP so that it can make the 6.11 merge window. Can we make dma_start_range / dma_end_range simple no-ops for the non-iommu code to avoid boilerplate code in the callers to avoid boilerplate code in the callers to deal with the two cases?