Date: Wed, 6 Mar 2024 23:14:00 +0100
From: Christoph Hellwig
To: Jason Gunthorpe
Cc: Christoph Hellwig, Leon Romanovsky, Robin Murphy, Marek Szyprowski,
	Joerg Roedel, Will Deacon, Chaitanya Kulkarni, Jonathan Corbet,
	Jens Axboe, Keith Busch, Sagi Grimberg, Yishai Hadas,
	Shameer Kolothum, Kevin Tian, Alex Williamson, Jérôme Glisse,
	Andrew Morton, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
	linux-nvme@lists.infradead.org, kvm@vger.kernel.org,
	linux-mm@kvack.org, Bart Van Assche, Damien Le Moal,
	Amir Goldstein, "josef@toxicpanda.com", "Martin K. Petersen",
	"daniel@iogearbox.net", Dan Williams, "jack@suse.com", Zhu Yanjun
Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to two steps
Message-ID: <20240306221400.GA8663@lst.de>
References: <47afacda-3023-4eb7-b227-5f725c3187c2@arm.com>
	<20240305122935.GB36868@unreal> <20240306144416.GB19711@lst.de>
	<20240306154328.GM9225@ziepe.ca> <20240306162022.GB28427@lst.de>
	<20240306174456.GO9225@ziepe.ca>
In-Reply-To: <20240306174456.GO9225@ziepe.ca>

On Wed, Mar 06, 2024 at 01:44:56PM -0400, Jason Gunthorpe wrote:
> There is a list of interesting cases this has to cover:
>
>  1. Direct map. No dma_addr_t at unmap, multiple HW SGLs
>  2. IOMMU aligned map, no P2P. Only IOVA range at unmap, single HW SGLs
>  3. IOMMU aligned map, P2P. Only IOVA range at unmap, multiple HW SGLs
>  4. swiotlb single range. Only IOVA range at unmap, single HW SGL
>  5. swiotlb multi-range. All dma_addr_t's at unmap, multiple HW SGLs.
>  6. Unaligned IOMMU. Only IOVA range at unmap, multiple HW SGLs
>
> I think we agree that 1 and 2 should be optimized highly as they are
> the common case. That mainly means no dma_addr_t storage in either

I don't think you can do without dma_addr_t storage.  In most cases
you can just store the dma_addr_t in the LE/BE encoded hardware
SGL, so no extra storage should be needed though.

> 3 is quite similar to 1, but it has the IOVA range at unmap.

Can you explain what P2P case you mean?  The switch one with the
bus address is indeed basically the same, just with potentially a
different offset, while the through host bridge case is the same
as a normal iommu map.

>
> 4 is basically the same as 2 from the driver's viewpoint

I'd actually treat it the same as 1.

> 5 is the slowest and has the most overhead.

And 5 could be broken into multiple 4s at least for now.  Or do you
have a different definition of range here?

> So are you thinking something more like a driver flow of:
>
>   .. extent IO and get # aligned pages and know if there is P2P ..
>   dma_init_io(state, num_pages, p2p_flag)
>   if (dma_io_single_range(state)) {
>        // #2, #4
>        for each io()
>             dma_link_aligned_pages(state, io range)
>        hw_sgl = (state->iova, state->len)
>   } else {

I think what you have as dma_io_single_range should come before
the dma_init_io.  If we know we can't coalesce, it really is just a
dma_map_{single,page,bvec} loop, with no need for any extra state.
And we're back to roughly the proposal I sent out years ago.

> This is not quite what you said, we split the driver flow based on
> needing 1 HW SGL vs need many HW SGL.

That's at least what I intended to say, and I'm a little curious as
to how it came across.
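
To make the ordering argued for above a bit more concrete, here is a
minimal user-space sketch, not kernel code and not the API from the
series.  It makes the "can this be a single range?" decision before
any per-I/O state is set up, falls back to a plain per-segment map
loop otherwise, and on unmap reads the addresses back out of the
little-endian hardware SGL instead of keeping a separate dma_addr_t
array.  Everything in it is an assumption made for illustration:
struct seg, struct hw_sge, map_io(), unmap_io(), the mock_* helpers,
MOCK_IOVA_BASE, and the alignment rule in can_coalesce() are all
hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u
#define MOCK_IOVA_BASE 0x10000000ull    /* hypothetical IOVA allocation */

typedef uint64_t dma_addr_t;            /* stand-in for the kernel type */

struct seg { uint64_t phys; uint32_t len; };

/* Hypothetical hardware SGL entry, little-endian as the device reads it. */
struct hw_sge { uint8_t addr_le[8]; uint8_t len_le[4]; };

static int live_mappings;               /* mappings not yet torn down */

static void put_le(uint8_t *p, uint64_t v, int n)
{
	for (int i = 0; i < n; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

static uint64_t get_le(const uint8_t *p, int n)
{
	uint64_t v = 0;

	for (int i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Mock of a per-segment mapping call; pretends to be a direct map. */
static dma_addr_t mock_map_page(uint64_t phys, uint32_t len)
{
	(void)len;
	live_mappings++;
	return phys;
}

static void mock_unmap_page(dma_addr_t addr, uint32_t len)
{
	(void)addr;
	(void)len;
	live_mappings--;
}

/* Assumed rule: an IOMMU is present and all interior boundaries are aligned. */
static bool can_coalesce(bool have_iommu, const struct seg *s, size_t n)
{
	if (!have_iommu)
		return false;
	for (size_t i = 0; i < n; i++) {
		if (i > 0 && s[i].phys % PAGE_SZ)
			return false;
		if (i + 1 < n && (s[i].phys + s[i].len) % PAGE_SZ)
			return false;
	}
	return true;
}

/* Returns the number of HW SGL entries written; *single records the path. */
size_t map_io(bool have_iommu, const struct seg *s, size_t n,
	      struct hw_sge *sgl, bool *single)
{
	assert(n > 0);
	*single = can_coalesce(have_iommu, s, n);   /* decided up front */
	if (*single) {
		uint32_t total = 0;

		for (size_t i = 0; i < n; i++)
			total += s[i].len;
		live_mappings++;                    /* one IOVA range */
		put_le(sgl[0].addr_le, MOCK_IOVA_BASE + s[0].phys % PAGE_SZ, 8);
		put_le(sgl[0].len_le, total, 4);
		return 1;
	}
	/* No coalescing possible: plain map loop, no extra state. */
	for (size_t i = 0; i < n; i++) {
		put_le(sgl[i].addr_le, mock_map_page(s[i].phys, s[i].len), 8);
		put_le(sgl[i].len_le, s[i].len, 4);
	}
	return n;
}

void unmap_io(bool single, const struct hw_sge *sgl, size_t nents)
{
	if (single) {
		live_mappings--;                    /* drop the one range */
		return;
	}
	/* Addresses come back out of the hardware SGL itself. */
	for (size_t i = 0; i < nents; i++)
		mock_unmap_page(get_le(sgl[i].addr_le, 8),
				(uint32_t)get_le(sgl[i].len_le, 4));
}
```

A caller would fill an array of struct seg, call map_io(), hand the
resulting entries to the device, and later pass the same entries to
unmap_io().  The only per-I/O state beyond the SGL in this sketch is
the single/multi flag.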