From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christoph Hellwig
Subject: Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
Date: Mon, 16 Oct 2017 09:30:12 +0200
Message-ID: <20171016073012.GC28270@lst.de>
References: <150776922692.9144.16963640112710410217.stgit@dwillia2-desk3.amr.corp.intel.com> <20171012142319.GA11254@lst.de> <20171013065716.GB26461@lst.de> <20171013163822.GA17411@obsidianresearch.com> <20171013173145.GA18702@obsidianresearch.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Errors-To: linux-nvdimm-bounces-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org
Sender: "Linux-nvdimm"
To: Dan Williams
Cc: linux-xfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Jan Kara, Andy Lutomirski, Arnd Bergmann, "linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org", Linux API, "Darrick J. Wong", Dave Chinner, Andrew Morton, Jason Gunthorpe, Linux MM, Al Viro, "J. Bruce Fields", Jeff Layton, linux-fsdevel, Linus Torvalds, Christoph Hellwig
List-Id: linux-api@vger.kernel.org

On Fri, Oct 13, 2017 at 11:22:21AM -0700, Dan Williams wrote:
> So, here's a strawman: can ibv_poll_cq() start returning ibv_wc_status
> == IBV_WC_LOC_PROT_ERR when file coherency is lost? This would make
> the solution generic across DAX and non-DAX. What's your feeling for
> how well applications are prepared to deal with that status return?

The problem isn't local protection errors, but remote protection errors
when we modify a MR with an rkey that the remote side accesses.

> > - How lease break can be done hitlessly, so the library user never
> > needs to know it is happening or see failed/missed transfers
>
> iommu redirect should be hitless and behave like the page cache case
> where RDMA targets pages that are no longer part of the file.

But systems that care about performance (e.g.
the usual RDMA users) usually don't use an IOMMU due to the performance
impact. Especially as HCAs already have their own built-in IOMMUs (aka
the MR mechanism).

Note that file systems already have a mechanism like you mention above
to keep extents that are busy from being reallocated. E.g. take a look
at fs/xfs/xfs_extent_busy.c. The downside is that this could lock down
a massive amount of space in the busy list if we for example have a MR
covering a huge file that is truncated down. So even if we'd want that
scheme we'd need some sort of ulimit for the amount of DAX pages locked
down in get_user_pages.
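
To make the last point concrete, here is a minimal sketch of what such a
limit could look like, assuming a per-user counter checked before pages
are pinned, in the spirit of RLIMIT_MEMLOCK. Every name in it
(dax_pin_account, dax_pin_charge, dax_pin_uncharge) is hypothetical and
not an existing kernel interface. It is plain C that shows only the
accounting; a real version would sit in the pinning path and need
locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-user accounting record; not a kernel structure. */
struct dax_pin_account {
	size_t pinned_pages;	/* DAX pages currently pinned for this user */
	size_t limit_pages;	/* ceiling, analogous to RLIMIT_MEMLOCK */
};

/*
 * Try to charge npages against the limit before pinning them.
 * Returns true and records the charge on success; returns false and
 * leaves the account untouched if the limit would be exceeded.
 */
static bool dax_pin_charge(struct dax_pin_account *acct, size_t npages)
{
	/* Already over the limit (e.g. the limit was lowered): refuse. */
	if (acct->pinned_pages > acct->limit_pages)
		return false;
	/* Compare against the remaining headroom so the sum cannot overflow. */
	if (npages > acct->limit_pages - acct->pinned_pages)
		return false;
	acct->pinned_pages += npages;
	return true;
}

/* Return npages to the budget once the pages are unpinned. */
static void dax_pin_uncharge(struct dax_pin_account *acct, size_t npages)
{
	/* Releasing more than was charged would be an accounting bug. */
	assert(npages <= acct->pinned_pages);
	acct->pinned_pages -= npages;
}
```

With a limit of 1024 pages, a registration of 1000 pages would succeed
and a further 25 would be refused, which bounds how much space a MR over
a truncated file could hold in the busy list.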