Date: Wed, 6 Mar 2024 13:44:56 -0400
From: Jason Gunthorpe
To: Christoph Hellwig
Cc: Leon Romanovsky, Robin Murphy, Marek Szyprowski, Joerg Roedel,
    Will Deacon, Chaitanya Kulkarni, Jonathan Corbet, Jens Axboe,
    Keith Busch, Sagi Grimberg, Yishai Hadas, Shameer Kolothum,
    Kevin Tian, Alex Williamson, Jérôme Glisse, Andrew Morton,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
    iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
    kvm@vger.kernel.org, linux-mm@kvack.org, Bart Van Assche,
    Damien Le Moal, Amir Goldstein, "josef@toxicpanda.com", "Martin K.
    Petersen", "daniel@iogearbox.net", Dan Williams, "jack@suse.com",
    Zhu Yanjun
Subject: Re: [RFC RESEND 00/16] Split IOMMU DMA mapping operation to two steps
Message-ID: <20240306174456.GO9225@ziepe.ca>
References: <47afacda-3023-4eb7-b227-5f725c3187c2@arm.com>
    <20240305122935.GB36868@unreal> <20240306144416.GB19711@lst.de>
    <20240306154328.GM9225@ziepe.ca> <20240306162022.GB28427@lst.de>
In-Reply-To: <20240306162022.GB28427@lst.de>
X-Mailing-List: linux-rdma@vger.kernel.org

On Wed, Mar 06, 2024 at 05:20:22PM +0100, Christoph Hellwig wrote:
> On Wed, Mar 06, 2024 at 11:43:28AM -0400, Jason Gunthorpe wrote:
> > I don't think they are so fundamentally different, at least in our
> > past conversations I never came out with the idea we should burden the
> > driver with two different flows based on what kind of alignment the
> > transfer happens to have.
>
> Then we talked past each other..

Well, we never talked to such detail

> > > So if we want to efficiently be able to handle these cases we need
> > > two APIs in the driver and a good framework to switch between them.
> >
> > But, what does the non-page-aligned version look like? Doesn't it
> > still look basically like this?
>
> I'd just rather have the non-aligned case for those who really need
> it be the loop over map single region that is needed for the direct
> mapping anyway.

There is a list of interesting cases this has to cover:

 1. Direct map. No dma_addr_t at unmap, multiple HW SGLs
 2. IOMMU aligned map, no P2P. Only IOVA range at unmap, single HW SGL
 3. IOMMU aligned map, P2P. Only IOVA range at unmap, multiple HW SGLs
 4. swiotlb single range. Only IOVA range at unmap, single HW SGL
 5. swiotlb multi-range. All dma_addr_t's at unmap, multiple HW SGLs.
 6. Unaligned IOMMU. Only IOVA range at unmap, multiple HW SGLs

I think we agree that 1 and 2 should be optimized highly as they are
the common case. That mainly means no dma_addr_t storage in either.

5 is the slowest and has the most overhead.

4 is basically the same as 2 from the driver's viewpoint.

3 is quite similar to 1, but it has the IOVA range at unmap.

6 doesn't have to be optimal, from the driver perspective it can be like 5.

That is three basic driver flows: 1/3, 2/4 and 5/6.

So are you thinking something more like a driver flow of:

  .. extent IO and get # aligned pages and know if there is P2P ..
  dma_init_io(state, num_pages, p2p_flag)
  if (dma_io_single_range(state)) {
       // #2, #4
       for each io()
            dma_link_aligned_pages(state, io range)
       hw_sgl = (state->iova, state->len)
  } else {
       // #1, #3, #5, #6
       hw_sgls = alloc_hw_sgls(num_ios)
       if (dma_io_needs_dma_addr_unmap(state))
           dma_addr_storage = alloc_num_ios(); // #5 only
       for each io()
           hw_sgl[i] = dma_map_single(state, io range)
           if (dma_addr_storage)
               dma_addr_storage[i] = hw_sgl[i]; // #5 only
  }

?

This is not quite what you said, we split the driver flow based on
needing 1 HW SGL vs needing many HW SGLs.

> > So are they really so different to want different APIs? That strikes
> > me as a big driver cost.
>
> To not have to store a dma_address range per CPU range that doesn't
> actually get used at all.

Right, that is a nice optimization we should reach for.

Jason