From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christoph Hellwig
Subject: Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
Date: Fri, 13 Oct 2017 08:50:47 +0200
Message-ID: <20171013065047.GA26461@lst.de>
References: <20171009185840.GB15336@obsidianresearch.com>
 <20171009191820.GD15336@obsidianresearch.com>
 <20171010172516.GA29915@obsidianresearch.com>
 <20171010180512.GA31734@obsidianresearch.com>
 <20171012182712.GA5772@obsidianresearch.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
To: Dan Williams
Cc: Jason Gunthorpe, "linux-nvdimm@lists.01.org", Jan Kara, Ashok Raj,
 "Darrick J. Wong", linux-rdma@vger.kernel.org, Greg Kroah-Hartman,
 Joerg Roedel, Dave Chinner, linux-xfs@vger.kernel.org, Linux MM,
 Jeff Moyer, Linux API, linux-fsdevel, Ross Zwisler, David Woodhouse,
 Robin Murphy, Christoph Hellwig, Marek Szyprowski
List-Id: linux-api@vger.kernel.org

On Thu, Oct 12, 2017 at 01:10:33PM -0700, Dan Williams wrote:
> On Thu, Oct 12, 2017 at 11:27 AM, Jason Gunthorpe wrote:
> > On Tue, Oct 10, 2017 at 01:17:26PM -0700, Dan Williams wrote:
> >
> >> Also keep in mind that what triggers the lease break is another
> >> application trying to write or punch holes in a file that is mapped
> >> for RDMA. So, if the hardware can't handle the iommu mapping getting
> >> invalidated asynchronously and the application can't react in the
> >> lease break timeout period then the administrator should arrange for
> >> the file to not be written or truncated while it is mapped.
> >
> > That makes sense, but why not return ENOSYS or something to the app
> > trying to alter the file if the RDMA hardware can't support this
> > instead of having the RDMA app deal with this lease break weirdness?
>
> That's where I started, an inode flag that said "hands off, this file
> is busy", but Christoph pointed out that we should reuse the same
> mechanisms that pnfs is using. The pnfs protection scheme uses file
> leases, and once the kernel decides that a lease needs to be broken /
> layout needs to be recalled there is no stopping it, only delaying.

That was just a suggestion - the important statement is that a hands off
flag is just a no-go.

> However, chatting this over with a few more people I have an alternate
> solution that effectively behaves the same as how non-ODP hardware
> handles this case of hole punch / truncation today. So, today if this
> scenario happens on a page-cache backed mapping, the file blocks are
> unmapped and the RDMA continues into pinned pages that are no longer
> part of the file. We can achieve the same thing with the iommu, just
> re-target the I/O into memory that isn't part of the file. That way
> hardware does not see I/O errors and the DAX data consistency model is
> no worse than the page-cache case.

Yikes.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .